FUSION POLYPEPTIDES
FIELD OF THE INVENTION 
The present invention relates to non-naturally occurring fusion polypeptides containing 
N-terminal portions cleavable by dipeptidylpeptidase IV (DPP IV). 
BACKGROUND OF THE INVENTION

The techniques of molecular biology, specifically recombinant DNA technology, allow 
for the production of relatively large quantities of desirable biologically active polypeptides. 
Furthermore, the genetic information encoding the polypeptides may be modified to produce 
relatively large quantities of modified polypeptides. Modifications made to the polypeptides are
often used to improve their activity or facilitate their production and/or preparation.
Accordingly, much effort has been made to determine what modifications are desirable in order 
to increase, enhance or otherwise alter the biological activity of desired polypeptides. In 
addition, there is a great deal of work being done to modify desired polypeptides to facilitate 
their production and purification. 

Naturally produced polypeptides are often initially biosynthesized as larger precursors
which are then trimmed by a series of proteolytic cleavages to produce the final products.
Accordingly, several proteases exist which recognize and cleave specific amino acids and/or 
amino acid sequences. These proteases participate in a conversion of a precursor protein to the 
final polypeptide product. 

One such protease is dipeptidylpeptidase IV (DPP IV) (EC 3.4.14.5). DPP IV was
first reported in Hopsu-Havu, V.K. and G.G. Glenner, Histo. Chemie 3:197-201 (1966) and
has been shown to be present in many mammalian tissues. DPP IV is presently commercially
available from Enzyme Systems Products (Dublin, California). DPP IV recognizes specific
amino acid sequences on the N-terminus of proteins. Specifically, DPP IV will cleave a
dipeptide from the N-terminus when the second amino acid from the N-terminus is proline
(Pro), hydroxyproline (Hyp), alanine (Ala), serine (Ser) or threonine (Thr), provided that the
amino acid residue third from the N-terminus is neither proline nor hydroxyproline. Any amino
acid may occupy the N-terminal residue position. DPP IV activity is more efficient when proline or
alanine is the second amino acid from the N-terminus and is usually most efficient when that
position is occupied by proline. The activity of DPP IV in the stepwise cleavage of "PRO"
parts of precursors of naturally occurring peptides is widely reported.
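
The position rule stated above can be written as a short predicate. The following Python sketch is illustrative only: the function name, the use of three-letter residue codes, and the requirement that at least three residues be present are assumptions made for this example and are not part of the disclosure.

```python
# Residues the text lists as permitted second from the N-terminus.
POSITION_TWO = {"Pro", "Hyp", "Ala", "Ser", "Thr"}
# Residues that prevent cleavage when they are third from the N-terminus.
BLOCKING_POSITION_THREE = {"Pro", "Hyp"}


def is_dpp4_substrate(residues):
    """Return True if the N-terminal dipeptide satisfies the stated rule.

    `residues` is a sequence of three-letter codes, N-terminus first.
    Assumption: a third residue must exist for the rule to be evaluated.
    """
    if len(residues) < 3:
        return False
    return (residues[1] in POSITION_TWO
            and residues[2] not in BLOCKING_POSITION_THREE)
```

For instance, a sequence beginning Tyr-Ala-Asp satisfies the rule, whereas one beginning Tyr-Ala-Pro does not, because proline occupies the third position.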

Modern technology has made possible the high-level production of biologically active
proteins. Important polypeptides can be synthesized using peptide synthesizers or in host cells
using recombinant DNA technology. Often, biologically active proteins are administered as
drugs. Numerous examples exist in which active proteins are used as therapeutics, prophylactics
or to enhance or repress traits. Since DPP IV and other proteases degrade proteins, these drugs
are susceptible to degradation. Thus, a problem of using biologically active polypeptides as
drugs is that their sustained presence is diminished and they must therefore be administered
more frequently.

The rapid developments in recombinant DNA methodology which have allowed the
production of polypeptides, proteins, and their analogs in large quantities in a very short period
of time have created a need to isolate these proteins in a highly efficient and predictable manner
from complex mixtures including the total amount of protein produced by the host cells and
those in the growth medium. The purification of heterologous polypeptides produced by host
cells can be very expensive and can cause denaturation of the protein product itself. An
overview of protein purification techniques is provided in the Background Art section of U.S.
Patent Number 4,782,137 issued Nov. 1, 1988 to Hopp et al., incorporated herein by
reference.

To circumvent the limitations in the art and provide better methods, recombinant DNA
technology may be used to provide desired polypeptides in the form of non-naturally occurring
proteins which contain a linker peptide that may be used as a ligand or other target for
purification means. For example, U.S. Patent Number 4,782,137 relates to synthesis of a
non-naturally occurring peptide containing an antigenic linker peptide. The non-naturally occurring
protein can be passed through a column containing immobilized antibodies which bind to the
antigenic linker, thus isolating the non-naturally occurring protein. U.S. Patent 4,569,794
relates to a process of purifying non-naturally occurring proteins which contain N-terminal
extensions that have an affinity for immobilized metals. The non-naturally occurring proteins
bind to immobilized metal ions in a column. One problem with these methods is that the linker
peptide is often undesirable and removal of the linker can be difficult.

The present invention relates to non-naturally occurring fusion proteins which comprise
a core protein portion and an N-terminal extension which is cleavable by DPP IV. According
to the present invention, a non-naturally occurring protein is provided wherein the extension
attached to the core protein is not an N-terminal extension that occurs in nature attached to the
core peptide; hence the term non-naturally occurring fusion protein. The present invention relates to
prodrugs which are DPP IV cleavable non-naturally occurring proteins wherein the core protein
portion is a biologically active protein. The present invention also relates to DPP IV cleavable
non-naturally occurring proteins useful in purification methods whereby the N-terminal extension
provides a feature or property which facilitates purification of the non-naturally occurring
protein.

The present invention provides non-naturally occurring proteins which have N-terminal
extensions that are cleavable by DPP IV such that exposure of the non-naturally occurring
protein to DPP IV results in conversion of the non-naturally occurring protein to a desirable
protein. When used as a prodrug, the non-naturally occurring protein is processed into a
biologically active protein in vivo using DPP IV present in the target species. When used in a
purification process, the non-naturally occurring protein can be purified by using its specifically
designed N-terminus as a ligand and then processed with DPP IV to remove the N-terminal
extension and liberate a desired protein.

The present invention allows for the production of a desired protein as a non-naturally
occurring protein that is later converted to the desired protein when exposed to DPP IV.
Prodrugs are converted to drugs over a course of time using the patient's endogenous DPP IV,
thereby achieving sustained presence of the active drug in a patient and reducing the frequency
of administration. Pure desired proteins can be isolated using the present invention by
producing and purifying non-naturally occurring proteins and then processing the non-naturally
occurring proteins in vitro with DPP IV to produce the desired protein.

SUMMARY OF THE INVENTION 
The present invention relates to a non-naturally occurring fusion protein comprising an
extension peptide portion covalently linked at its C-terminus to the N-terminus of a core protein
portion, said extension peptide portion being of the formula:
A-X-Y(X'-Y)n

wherein

A is optional and when present is methionine;
n is 0-20;
X is selected from the group consisting of all naturally occurring amino acid residues;
X' is selected from the group consisting of all naturally occurring amino acid residues
except proline and hydroxyproline;
Y is selected from the group consisting of proline, hydroxyproline, alanine, serine and
threonine, except that when n is 0, Y is selected from the group consisting of alanine, serine and
threonine.
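
As an illustration of how the formula constrains a candidate extension, the following Python sketch checks a residue list against A-X-Y(X'-Y)n as defined in this summary. It is a sketch under stated assumptions: the function and constant names are hypothetical, residues are written as three-letter codes, and "all naturally occurring amino acid residues" is approximated by the twenty standard amino acids plus hydroxyproline.

```python
# Assumption: this set stands in for "all naturally occurring amino acid residues".
NATURAL = {
    "Ala", "Arg", "Asn", "Asp", "Cys", "Gln", "Glu", "Gly", "His", "Ile",
    "Leu", "Lys", "Met", "Phe", "Pro", "Ser", "Thr", "Trp", "Tyr", "Val", "Hyp",
}
Y_GENERAL = {"Pro", "Hyp", "Ala", "Ser", "Thr"}  # allowed Y when n >= 1
Y_DIPEPTIDE = {"Ala", "Ser", "Thr"}              # allowed Y when n == 0
X_PRIME_EXCLUDED = {"Pro", "Hyp"}                # never allowed as X'


def matches_extension_formula(residues, max_n=20):
    """Check a residue list against A-X-Y(X'-Y)n with optional A = Met."""
    core = list(residues)
    # X-Y(X'-Y)n always has an even length, so an odd length implies A is present.
    if len(core) % 2 == 1:
        if core[0] != "Met":
            return False
        core = core[1:]
    if len(core) < 2:
        return False
    n = len(core) // 2 - 1
    if n > max_n:
        return False
    x, first_y = core[0], core[1]
    allowed_y = Y_DIPEPTIDE if n == 0 else Y_GENERAL
    if x not in NATURAL or first_y not in allowed_y:
        return False
    # Walk the n trailing X'-Y pairs.
    for i in range(2, len(core), 2):
        x_prime, y = core[i], core[i + 1]
        if x_prime not in NATURAL or x_prime in X_PRIME_EXCLUDED:
            return False
        if y not in Y_GENERAL:
            return False
    return True
```

Under these assumptions, Tyr-Ala (n = 0) and Tyr-Pro-Gly-Ala (n = 1) are accepted, while the dipeptide Tyr-Pro is rejected because Y may not be proline when n is 0.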

The present invention also relates to the use of such non-naturally occurring proteins in
medicinal preparations and to a method of purifying desired proteins from a mixture containing
such non-naturally occurring proteins and impurities comprising the steps of selectively
contacting said non-naturally occurring protein with material which immobilizes said
non-naturally occurring protein, removing said impurities, separating said non-naturally occurring
proteins from said material, contacting said non-naturally occurring protein with DPP IV, and
isolating said desired protein.

INFORMATION DISCLOSURE 
U.S. Patent No. 4,569,794 issued February 11, 1986 to Smith et al. relates to the
process of purifying proteins and compounds useful in such processes. The invention describes
a process of isolating fusion proteins which have biologically active polypeptides at the
C-terminal end and an N-terminal extension linker that is a metal ion chelating linker. The fusion
peptide has an affinity to immobilized metal ions. Impurities can be removed by passing a
mixture containing the fusion protein through a column containing immobilized metal ions.
The fusion protein becomes associated with the metal ions and only the impurities are eluted.
Upon changing conditions the fusion peptide is liberated from the immobilized metal ions thus
resulting in purified fusion protein.

U.S. Patent No. 4,782,137 issued November 1, 1988 to Hopp et al. discloses the
synthesis of a fusion protein having a highly antigenic N-terminal portion and a desired
polypeptide at the C-terminal portion. According to Hopp et al., the fusion proteins are
purified from crude supernatant by passing the crude supernatant through a column containing
immobilized antibodies which recognize the antigenic portion of the fusion protein. The
immobilized antibodies keep the protein in the column while the undesired components of the
supernatant are eluted. The column conditions can then be changed to cause the
antigen-antibody complex to dissociate. The fusion protein is then eluted and collected.

U.S. Patent No. 4,734,399 issued March 29, 1988, to Felix, et al. relates to growth
hormone releasing factor analogs. Several analogs are disclosed which have N-terminal
dipeptides of Tyr-Ala and His-Ala. However, these molecules are not fusion proteins but rather core
proteins only. The N-terminal dipeptides of Felix, et al. are part of the bGRF analog core
molecule.

European patent application Publication Number 0 220 958, published May 6, 1987
relates to selective chemical removal of N-terminal residues. The invention relates to a process
and compounds useful in the process to remove N-terminal residues from desired polypeptides.
The desired polypeptide exists as a fusion protein having the desired polypeptide linked at the
N-terminal to a linker having the formula X-Pro. Upon exposure of the fusion protein to specific
buffer conditions a diketopiperazine of the X-Pro portion of the fusion protein is formed and
cleaved, thereby producing the desired polypeptide from the fusion precursor. The fusion
proteins of EPO 220,958 ('958) are not included in the present invention because according to
the present invention, when the N-terminal extension is only a dipeptide, i.e., when A is
absent, n is zero and X is a naturally occurring amino acid, Y is either alanine, serine or
threonine. Thus, whenever the extension is a dipeptide, it is X-Ala, X-Ser or X-Thr. The '958
application teaches chemical, not enzymatic, cleavage of the dipeptide X-Pro. The dipeptides
X-Ala, X-Ser and X-Thr are not susceptible to the type of chemical cleavage taught by the '958
application that cuts the X-Pro extension from the core protein.

Australian Patent Application Document No. AU-A-12709/88 discloses fusion proteins
which contain affinity peptides useful in immobilized metal affinity chromatography (IMAC).
The affinity peptides disclosed contain at least two neighboring histidine residues. The
IMAC purification means disclosed requires a special synthetic chemistry for making
nitrilotriacetic acid (NTA) resins.

Tallon, M.A., et al., Biochem. 26:7767-7774 (1987) relate to synthesis of extended
analogs of the tridecapeptide α-factor from Saccharomyces cerevisiae. The synthesized analogs
are extended α-factors, which represent sequences of naturally occurring pro-α-factor coded for
in the MFα1 structural gene.

Kreil, G. et al., Eur. J. Biochem. 111:49-58 (1980) describes the stepwise cleavage of
the N-terminal portion of melittin precursor (promelittin) by dipeptidylpeptidase IV.
Melittin is the main constituent of honeybee venom. In the amino acid sequence of the
N-terminal portion of the precursor, every second residue is either proline or alanine. When
promelittin is exposed to DPP IV isolated from pig kidney, the N-terminal region of the
precursor is cleaved in a stepwise fashion producing the mature protein. Promelittin, unlike
fusion proteins according to the present invention, is a naturally occurring protein.

Julius, D., et al., Cell, Vol 32:839-852 (March 1983) relates to the role of membrane
bound DPP IV in the processing of yeast α-factor from a larger precursor polypeptide. The
yeast α-factor, unlike fusion proteins according to the present invention, is a naturally occurring
protein.

Mollay, C. et al., Eur. J. Biochem. 160:31-35 (1986) describes the isolation of DPP IV
from the skin secretion of Xenopus laevis. The activity of DPP IV is discussed.

Mentlein, R., FEBS, Vol. 234, No. 2, pp. 251-256 (July 1988) reviews proline residues
in the maturation and degradation of peptide hormones and neuropeptides. It is reported that in
mammals, proline specific proteases such as DPP IV are not involved in the biosynthesis of
regulatory peptides but may play an important role in their degradation. Thus, it is
concluded that while in invertebrates and lower vertebrates precursor proteins rely on DPP IV to
convert precursors to mature forms, the processing of regulatory proteins in mammals generally
uses DPP IV as a degradation protease.

Frohman, L. A. et al., J. Clin. Invest. 78:906-913 (1986) report that human growth
hormone releasing factor (hGRF) and its analogs are rapidly degraded in vivo in humans
and in vitro by plasma DPP IV.

Frohman, L. A. et al., J. Clin. Invest. 83:1533-1540 (1989) report that human growth
hormone releasing factor (hGRF) and its analogs are rapidly degraded in vivo in humans and
in vitro by plasma DPP IV.

Kubiak, T.M., et al., Drug Metabolism and Disposition, Vol. 17, No. 4, pp. 393-397
(1989) refer to the in vitro metabolic degradation of bovine growth hormone releasing factor
(bGRF) analogs in bovine and porcine plasma and the correlation with plasma DPP IV activity.
The bGRF analogs tested had an Ala residue at position 2 of the N-terminus. It is reported
that the metabolic degradation of bGRF in plasma is due to the presence of DPP IV in the
plasma.

WO 92/10576 PCT/US91/09152

Hong, W., et al., Biochemistry, 28:8474-8479 (1989) report the expression of
enzymatically active DPP IV in Chinese hamster ovary cells after transfection.

Kreil, G., TIBS 15:23-26 (January 1990) reviews the stepwise cleavage of dipeptides
by DPPs in the conversion of precursors to final products. The precursors described by Kreil
are naturally occurring proteins. The fusion proteins of the present invention are non-naturally
occurring fusion proteins.

Boman, et al., J. Biol. Chem. 264:5852-5860 (1989) demonstrated that a dipeptidyl
peptidase isolated from cecropia pupae (with similar specificity to DPP IV) was able to remove
natural N-terminal sequences of Ala-Pro-Glu-Pro from the N-terminal of synthetic copies of the
natural precursors of cecropin A and B. The preprocecropin disclosed by Boman is a naturally
occurring protein.

Dalboge, H., et al., Bio/technology, 5:161-164 (February 1987) disclose converting
E. coli produced precursor of human growth hormone (hGH) to authentic hGH in vitro. The
N-terminal extension of the precursor is removed by dipeptidylpeptidase I.

Dalboge, H., et al., FEBS, Vol. 246 (1,2):89-93 (March 1989) discloses the cloning and
expression of IL-1β precursor and its conversion to IL-1β by removal of the precursor's
N-terminal extension using dipeptidylpeptidase I.

Dalboge, H., et al., FEBS, Vol. 266 (1,2):1-3 (June 1990) refer to in vivo processing of
N-terminal methionine in E. coli. It is reported that the removal of the N-terminal methionine
from extended human growth hormone was dependent upon the amino acid adjacent to the
methionine.

Hopp, T.P. et al., Bio/Technol. 6:1204-1210 (October 1988), disclose addition of an
eight amino acid peptide to the N-terminus of a desired recombinant lymphokine in order to
provide an antigenic N-terminus which can be used in immunoaffinity purification. This
publication corresponds to U.S. Patent No. 4,782,137 described above.

Smith, M. C., et al., J. Biol. Chem. Vol. 263, 15:7211-7215 (1988) disclose
experimental results supporting the hypothesis that specific metal chelating peptides on the NH2
terminus of a small peptide can be used to purify that protein using metal ion affinity
chromatography. This reference provides specific data regarding one of the examples in the
above described U.S. Patent Number 4,569,794. Specifically, the use of the metal chelating
peptide His-Trp linked to either luteinizing hormone-releasing hormone or proinsulin allows the
chimeric peptide to be purified using IMAC whereas control molecules not containing the
His-Trp linker cannot be recovered in the like manner.

Hochuli, E. et al., J. Chromat. 411:177-184 (1987) disclosed a nitrilotriacetic acid
adsorbent useful for metal chelate affinity chromatography. It is reported that the disclosed
adsorbent when charged with Ni2+ is useful in binding to peptides and proteins containing
neighboring histidine residues.

Ljungquist, C. et al., Eur. J. Biochem. 186:563-569 (1989) disclose the use of the
metal chelating peptide Ala-His-Gly-His-Arg-Pro in multiplicities of two, four and eight
together with a column containing immobilized Zn2+ ions. According to Ljungquist, use of this
metal chelating peptide with zinc columns provides unexpectedly good purification of the fusion
proteins.

DETAILED DESCRIPTION OF THE INVENTION

As used herein, the terms "non-naturally occurring fusion protein", "non-naturally
occurring fusion polypeptide", "fusion polypeptides" and "fusion proteins" refer
interchangeably to proteins and polypeptides which do not normally occur in nature and which
comprise a core protein portion and an extension portion.

As used herein "core protein", "core protein portion" and "polypeptide portion" refer
to the portion of a fusion polypeptide which is located at the C-terminal end of the molecule
and which, absent the extension portion, would be a desired polypeptide and/or a biologically
active protein including naturally occurring biologically active proteins and polypeptides and
analogs and mutants thereof.

As used herein "N-terminal extension" refers to the amino acids, up to about 45 in number,
which start at the N-terminus and which are not part of the core protein.

As used herein "prodrugs" refers to fusion proteins wherein the biologically desired
portion is a biologically active protein useful as a drug.

As used herein "biologically active protein" and "biologically active polypeptides" refer
interchangeably to proteins and polypeptides which possess biological activity.

As used herein "desired protein" and "desirable protein" refer interchangeably to
proteins and polypeptides which are sought in pure form.

As used herein "extension portion" refers to the portion of a fusion protein which is an
N-terminal extension and which is not part of the biologically desired portion.

As used herein "DPP IV cleavable N-terminal extension portion" refers to the extension
portion of a fusion protein which has an amino acid sequence that can be removed by
stepwise cleavage by DPP IV.

In the Sequence Listing Section, some amino acid residues have been designated Xaa in
Seq ID. The following descriptions apply:
In Seq ID No. 3 Xaa 29 represents C-terminally amidated Argininyl residue.
In Seq ID No. 4 Xaa 29 represents C-terminally amidated Argininyl residue.
In Seq ID No. 5 Xaa 29 represents C-terminally amidated Argininyl residue.
In Seq ID No. 14 Xaa 29 represents C-terminally amidated Argininyl residue.
In Seq ID No. 18 Xaa 31 represents C-terminally amidated Argininyl residue.
In Seq ID No. 19 Xaa 33 represents C-terminally amidated Argininyl residue.
In Seq ID No. 20 Xaa 39 represents C-terminally amidated Argininyl residue.
In Seq ID No. 21 Xaa 45 represents C-terminally amidated Argininyl residue.
In Seq ID No. 24 Xaa 27 represents C-terminally amidated Argininyl residue.
In Seq ID No. 25 Xaa 31 represents C-terminally amidated Argininyl residue.
In Seq ID No. 26 Xaa 33 represents C-terminally amidated Argininyl residue.
In Seq ID No. 27 Xaa 35 represents C-terminally amidated Argininyl residue.
In Seq ID No. 28 Xaa 37 represents C-terminally amidated Argininyl residue.
In Seq ID No. 29 Xaa 33 represents C-terminally amidated Argininyl residue.
In Seq ID No. 30 Xaa 35 represents C-terminally amidated Argininyl residue.
In Seq ID No. 31 Xaa 37 represents C-terminally amidated Argininyl residue.
In Seq ID No. 32 Xaa 39 represents C-terminally amidated Argininyl residue.
In Seq ID No. 33 Xaa 45 represents C-terminally amidated Argininyl residue.
In Seq ID No. 34 Xaa 43 represents C-terminally amidated Argininyl residue.
In Seq ID No. 35 Xaa 45 represents C-terminally amidated Argininyl residue.
In Seq ID No. 36 Xaa 31 represents C-terminally amidated Argininyl residue.
In Seq ID No. 37 Xaa 31 represents C-terminally amidated Argininyl residue.
In Seq ID No. 38 Xaa 31 represents C-terminally amidated Argininyl residue.
In Seq ID No. 39 Xaa 31 represents C-terminally amidated Argininyl residue.
In Seq ID No. 40 Xaa 31 represents C-terminally amidated Argininyl residue.
In Seq ID No. 41 Xaa 33 represents C-terminally amidated Argininyl residue.
In Seq ID No. 42 Xaa 31 represents C-terminally amidated Argininyl residue.
In Seq ID No. 43 Xaa 33 represents C-terminally amidated Argininyl residue.

The present invention relates to improved proteins and polypeptides. According to the
present invention, biologically active polypeptides are first produced as fusion proteins which
contain two portions: a first portion which represents the core protein portion; and a
second portion which is an N-terminal extension portion that is covalently linked at its carboxy
terminus to the amino terminus of the first portion. The N-terminal extension portion of the
fusion polypeptide possesses an amino acid sequence which renders it subject to cleavage by
dipeptidylpeptidase IV (DPP IV).

A fusion protein according to the present invention has the formula:
Extension portion - Core protein portion

wherein "Extension portion" represents a DPP IV cleavable N-terminal extension; " - "
represents a covalent peptide bond; "Core protein portion" represents any desired peptide
which is liberated from the Extension portion by DPP IV processing.

The Extension portion of a fusion protein according to the present invention has an
amino acid sequence according to the formula:
A-X-Y(X'-Y)n

wherein A is optional, and when present is methionine;

n represents the number of sequentially linked X'-Y groups, that number representing
from 0 to 20 of such groups, preferably 0 to 10 groups;

X is selected from the group consisting of any naturally occurring amino acid;

Y is selected from the group consisting of proline, alanine, serine, and threonine,
except when n = 0, then Y is selected from the group consisting of alanine, serine, and
threonine;

X' is selected from the group consisting of any naturally occurring amino acid except
proline or hydroxyproline.

According to the formula, when n = 1, there are two Y residues. Further, it is
possible to have up to twenty-one Y residues and twenty X' residues in a single embodiment.
Individual Y residues and X' residues respectively can be any residue of the group from which
they are selected. That is, all of the individual Y residues do not have to be the same in a
given embodiment. Similarly, in an embodiment with more than one X' residue, each
individual X' residue present can be any amino acid residue except proline and hydroxyproline
irrespective of what residue any other X' residue may be. Each individual Y and X' residue
respectively must conform to the rules for that particular group and all that is necessary is that
the various individual residues at the specific positions follow the rules as articulated above.
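
The residue counts implied by the formula follow directly from n. The short Python sketch below tabulates them; the function name and the dictionary keys are hypothetical and serve only to make the arithmetic explicit.

```python
def residue_counts(n, has_met=False):
    """Count residues in A-X-Y(X'-Y)n for a given n (0 to 20)."""
    if not 0 <= n <= 20:
        raise ValueError("n must be between 0 and 20")
    a = 1 if has_met else 0
    return {
        "A": a,          # optional leading methionine
        "X": 1,          # single X residue
        "X'": n,         # one X' per repeated group
        "Y": n + 1,      # one Y after X plus one per repeated group
        "total": 2 * (n + 1) + a,
    }
```

With n = 1 this gives two Y residues, and with n = 20 it gives twenty-one Y residues and twenty X' residues, matching the counts stated above.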

Fusion proteins in which (A) is present as methionine (Met) represent sequences useful
for the production of biologically active proteins by recombinant DNA methods in E. coli. The
Met residue present in these precursors usually will be processed by the E. coli enzymatic
system or some other means which can be performed by a person with ordinary skill in the art.
Protein synthesis in E. coli is, under normal circumstances, initiated at the translation initiation
codon AUG coding for Met. As a consequence, the newly synthesized polypeptides have a
methionine residue as their N-terminal amino acid. E. coli possesses an enzymatic activity with
the capacity to effectively remove N-terminal Met when the Met N-terminal residue is adjacent
to an amino acid with a relatively small side chain like Gly, Ala or Ser as well as Pro. Highly
specific removal of the N-terminal Met can be accomplished using cyanogen bromide mediated
cleavage of Met. However, for that procedure to be successful, the N-terminal Met must be
the only Met in the entire protein sequence; otherwise the cleavage will take place after each
Met in the sequence. Accordingly, for fusion proteins containing internal Met residues, the
second amino acid from the N-terminus must be Pro, Gly, Ala or Ser if the Met is to be
removed by the E. coli enzymatic system.
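
The two routes for removing the N-terminal Met described above can be compared with a small Python sketch. It is a simplified illustration: the function names are hypothetical, residues are three-letter codes, and the cyanogen bromide model only splits the chain after each Met without representing any chemical modification of the cleaved residue.

```python
# Residues next to the N-terminal Met that, per the text, permit its removal in E. coli.
SMALL_SECOND_RESIDUES = {"Gly", "Ala", "Ser", "Pro"}


def ecoli_can_remove_met(residues):
    """True if the chain starts with Met followed by Gly, Ala, Ser or Pro."""
    return (len(residues) >= 2
            and residues[0] == "Met"
            and residues[1] in SMALL_SECOND_RESIDUES)


def cnbr_fragments(residues):
    """Split the chain after every Met, as cyanogen bromide cleavage would."""
    fragments, current = [], []
    for residue in residues:
        current.append(residue)
        if residue == "Met":
            fragments.append(current)
            current = []
    if current:
        fragments.append(current)
    return fragments


def cnbr_is_specific(residues):
    """True only if the N-terminal Met is the sole Met in the chain."""
    return (len(residues) > 0
            and residues[0] == "Met"
            and list(residues).count("Met") == 1)
```

A chain with an internal Met, such as Met-Tyr-Ala-Met-Gly, is split into three fragments by this model, which is why the text restricts the cyanogen bromide route to proteins whose only Met is the N-terminal one.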

In addition to fusion polypeptides, the present invention relates to: recombinant DNA 
molecules which comprise DNA sequences that encode the fusion polypeptides; methods of 
using the recombinant DNA molecules; methods of using the fusion polypeptides including
methods of purifying desired polypeptides and methods of delivering drugs which comprise 
administering prodrugs that are converted from precursor to biologically active forms by 
stepwise proteolytic removal of the N-terminal extension in vivo. 

Production of fusion polypeptides can be accomplished using standard peptide synthesis
or recombinant DNA techniques, both well known to those having ordinary skill in the art.
Peptide synthesis is the preferred method of making polypeptides which comprise about 50
amino acids or less. For larger molecules, production in host cells using recombinant DNA
technology is preferred.

Fusion polypeptides which contain N-terminal portions that are recognized and cleaved
by DPP IV are useful and advantageous over unmodified polypeptides comprising only the core
protein portion. The present invention describes two areas of particular utility. The first use is
to provide fusion polypeptides, termed "prodrugs", which comprise biologically active
polypeptides that are useful as drugs covalently linked to DPP IV cleavable N-terminal
extensions. These proforms can be converted into biologically active forms upon cleavage by
DPP IV in the body of a human or other animal that has been administered the prodrug.
Accordingly, the present invention relates to fusion polypeptides useful as prodrugs, use of
fusion polypeptides in a medicinal preparation and to a method of delivering biologically active
polypeptides to a patient. A second use for fusion polypeptides according to the present
invention is in protein purification processes in which the N-terminal extension is the
component of the polypeptide which renders it effective in a purification method and which
N-terminal extension is then removable by cleavage using DPP IV. Accordingly, the present
invention relates to fusion polypeptides useful in purification procedures and to a method of
purifying desired polypeptides. These uses serve as examples to illustrate the utility of the
present invention and are not meant to limit the invention in any way.

For both uses, the core protein portion of the fusion protein is liberated from the
extension portion by DPP IV activity. In the case of fusion proteins used in purification
methods, it is undesirable that the core proteins be substrates for DPP IV cleavage. That is, it
is preferred that DPP IV not be able to cleave the core protein after the extension portion has
been removed. It is often most desirable that when a core protein is a DPP IV substrate, it is
delivered as a prodrug. In such cases, the prodrug can result in sustained presence of the core
protein since some of the DPP IV found in vivo (e.g. in plasma, kidney tissue and liver tissue)
will be used to process N-terminal extensions and, therefore, delay core protein degradation.
That is, the extension portion of the fusion protein can act as a substrate for DPP IV and as a
competitive inhibitor, delaying the DPP IV action on the core protein thereby temporarily
protecting the core protein.

As used herein, "prodrug" means a fusion protein which contains a DPP IV cleavable
N-terminal extension covalently linked to a core protein portion that is a biologically active
polypeptide useful as a drug. Prodrugs according to the present invention can be administered
as an individual proform or in combination with other compounds. The preferred embodiment
is a well defined individual form of a prodrug. In either case, the proforms are processed by
naturally occurring DPP IV normally found in the body.

The advantage of administering a prodrug in a medicinal preparation is that it delays
activity and/or provides for extended presence of the biologically active protein. Prodrugs can
remain active longer than unmodified molecules. Prodrugs can exist in a non-active state until
such time elapses that a sufficient portion of the extension portion is degraded and the molecule
becomes active. Prodrugs, therefore, can act as a time delayed drug delivery system.
Furthermore, different N-terminal extensions are degraded at different rates, depending on their
length and the specific residues present in their amino acid sequence. Combinations of different
forms of prodrugs having a variety of N-terminal extensions can be provided which can provide
a sustained, steady level of active drug in a patient over a course of time.

As described above, DPP IV cleaves off a dipeptide from the N-terminus of a
polypeptide provided certain residues occupy certain positions. As used herein, "position one"
refers to the amino acid residue position at the N-terminus. As used herein, "position two"
refers to the amino acid residue position which is immediately adjacent to position one and
which is second from the N-terminus. As used herein, "position three" refers to the amino acid
residue position which is immediately adjacent to position two and which is third from the
N-terminus. The cleavage which will remove the N-terminal dipeptide occurs between position
two and position three provided amino acid three is not proline or hydroxyproline and amino
acid two is one of five amino acids: proline (Pro), hydroxyproline (Hyp), alanine (Ala), serine
(Ser), or threonine (Thr).
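
The position rule stated above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name and the representation of a polypeptide as a list of three-letter residue codes are assumptions made for the example, and the check encodes nothing beyond the position-two and position-three conditions given in the text.

```python
# Position-two residues that the text lists as permitting DPP IV cleavage.
POSITION_TWO_ALLOWED = {"Pro", "Hyp", "Ala", "Ser", "Thr"}
# Position-three residues that the text says prevent cleavage.
POSITION_THREE_BLOCKING = {"Pro", "Hyp"}


def is_dpp_iv_cleavable(residues):
    """Return True if the N-terminal dipeptide can be removed under the stated rule.

    `residues` is a list of three-letter codes, N-terminus first (hypothetical
    representation chosen for this sketch).
    """
    if len(residues) < 3:
        # Assumption: the rule needs a position-three residue to be evaluated.
        return False
    return (residues[1] in POSITION_TWO_ALLOWED
            and residues[2] not in POSITION_THREE_BLOCKING)


print(is_dpp_iv_cleavable(["Gly", "Pro", "Phe", "Ala"]))  # True
print(is_dpp_iv_cleavable(["Gly", "Pro", "Pro", "Ala"]))  # False: Pro at position three
print(is_dpp_iv_cleavable(["Gly", "Leu", "Phe", "Ala"]))  # False: Leu at position two
```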

DPP IV cleaves the N-terminal residues at a different rate depending upon which of the
four amino acid residues is present at position two. In most cases, DPP IV cleaves most
efficiently when position two is occupied by Pro and it is next most efficient when position two
is occupied by Ala. When position one is occupied by tyrosine, phenylalanine or histidine,
DPP IV works at about the same rate when position two is occupied by Pro or Ala. DPP IV is
next most efficient when position two is occupied by Ser. It is least efficient when Thr
occupies the second position.

Using this information, a variety of N-terminal extensions can be designed which are
processed at different rates. Thus, a medicament can be administered which comprises either a
specific prodrug or a combination of prodrug forms. The prodrugs, bearing an assortment of
N-terminal extensions, will each be processed at a rate which is dependent upon their amino
acid sequences. This combination of prodrug forms can be formulated to comprise a series of
prodrugs that are processed into active polypeptides across a spectrum of time.

The length and amino acid composition of the extension are controlling factors in the rate
of DPP IV cleavage. Extensions in which all or nearly all of the alternating position-two
residues (Y) are Pro will be processed the fastest, while those containing Y=Thr will be
converted the slowest. In addition, it is known that dipeptidyl units X-Pro where X is either
Glu or Asp are cleaved much more slowly than their counterparts where X is a neutral or basic
amino acid residue. Since extensions can comprise different residues (of the four) at each
cleavage position in the extension, an extremely high number of variations and permutations
can exist.
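
As a rough illustration of how an extension is consumed two residues at a time, the following Python sketch repeatedly applies the position-two/position-three rule until it no longer holds. The function name, the list-of-codes representation, and the four-residue "core" used in the usage lines are hypothetical placeholders chosen for the example; the sketch says nothing about actual rates.

```python
ALLOWED_POSITION_TWO = {"Pro", "Hyp", "Ala", "Ser", "Thr"}
BLOCKING_POSITION_THREE = {"Pro", "Hyp"}


def process_stepwise(residues):
    """Remove N-terminal dipeptides one at a time while the stated rule is met.

    Returns (removed_dipeptides, remaining_residues).
    """
    current = list(residues)
    removed = []
    while (len(current) >= 3
           and current[1] in ALLOWED_POSITION_TWO
           and current[2] not in BLOCKING_POSITION_THREE):
        removed.append(tuple(current[:2]))  # dipeptide released in this step
        current = current[2:]               # new N-terminus for the next step
    return removed, current


# Hypothetical fusion: extension Gly-Pro-Phe-Ala on a placeholder core.
extension = ["Gly", "Pro", "Phe", "Ala"]
placeholder_core = ["Lys", "Gly", "Glu", "Leu"]  # not a real drug sequence
removed, remaining = process_stepwise(extension + placeholder_core)
print(removed)    # [('Gly', 'Pro'), ('Phe', 'Ala')]
print(remaining)  # ['Lys', 'Gly', 'Glu', 'Leu']
```

Processing stops at the placeholder core because its position two (Gly) is not one of the permitted residues, which mirrors the case of a core protein that is not itself a DPP IV substrate.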

Any biologically active polypeptide can be used as a polypeptide drug. PCT patent
application number PCT/US90/02923, PCT patent application number PCT/US91/08248, and U.S.
Patent Application Serial Number 07/368,231, each incorporated herein by reference, each
disclose bovine growth hormone releasing factor analogs which can be used in a medicinal
preparation as a prodrug according to the present invention. Any of the analogs taught in these
applications can be used as a core peptide portion of a fusion protein according to the present
invention. Fusion proteins comprising such core protein portions linked to extension portions
may be produced by those having ordinary skill in the art using well known methods.

Other examples of embodiments of the present invention include hormones, receptors,
enzymes, storage proteins and blood proteins. Specific examples include: Vasoactive Intestinal
Peptide (VIP); human β-casomorphin; Gastric Inhibitory Peptide (GIP); Gastrin Releasing
Peptide (GRP); human Peptide HI; human Peptide YY; fragment 7-37 of glucagon-like
peptide-1; glucagon-like peptide-2; substance P; Neuropeptide Y; human Pancreatic Polypeptide;
insulin-like growth factor-1 (IGF-1); human growth hormone (hGH); bovine growth hormone
(bGH); porcine growth hormone (pGH); prolactin (PRL); human, bovine, porcine or ovine
growth hormone releasing factor (GRF); interleukin-1β (IL-1β); EGF; IGF-2; glucagon;
corticotropin releasing factor (CRF); dynorphin; somatostatin-14; endothelin; transforming
growth factor α (TGF-α); transforming growth factor β (TGF-β); interleukin-4; interleukin-6;
nerve growth factor (NGF); tumor necrosis factor (TNF); insulin; fibroblast growth factor
(FGF); interferon; CD4; and interleukin-2 (IL-2) or their synthetic or biosynthetic analogs.
These polypeptides can also be used to form the core protein portion of fusion proteins
according to the present invention. These polypeptides are meant only to serve as examples of
embodiments and are not meant to limit the scope of the present invention.

Smaller fusion proteins according to the present invention can be synthesized, for
example, by solid-phase methodology utilizing an Applied Biosystems 430A peptide synthesizer
(Applied Biosystems, Foster City, California) as described in detail in PCT/US90/02923 and
U.S. Patent Application Serial Number 07/368,231.

For larger molecules, production in host cells using recombinant DNA is preferred.
There are several different methods available to one having ordinary skill in the art who wishes
to use recombinant DNA technology to produce fusion proteins. Typically, genes encoding
desired polypeptides are inserted in expression vectors which are then used to transform or
transfect suitable host cells. The inserted gene is then expressed in the host cell and the desired
polypeptide is produced. To produce the fusion polypeptides of the present invention in a like
manner, an additional DNA sequence is included in the gene insert. Specifically, DNA
encoding the N-terminal extension residues is operably linked to the 5' end of the gene
encoding the desired polypeptide. This additional genetic material must be placed downstream
from the promoter of the expression vector so that it is under the control of the promoter.
Additionally, it must be placed in proper reading frame with the gene so that the expressed
protein product includes the N-terminal extension residues covalently linked to the desired
polypeptide.

Therefore, to produce fusion proteins according to the present invention using
recombinant DNA technology, oligonucleotides must be designed which encode the amino acid
sequence of the desired N-terminal extension and these oligonucleotides must be operably
inserted upstream of the 5' end of the gene encoding the core protein portion, generating a
chimeric gene. The techniques to make oligonucleotides and the techniques used to produce a
chimeric gene are well known to those having ordinary skill in the art.
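
The reading-frame requirement can be illustrated with a short Python sketch that back-translates an extension into DNA and checks that the result is a whole number of codons. The codon chosen for each residue is an arbitrary pick from the standard genetic code made for this example only, not a codon usage recommended by the text; the function names and the sample extension are likewise hypothetical.

```python
# One standard-code codon per residue, chosen arbitrarily for illustration.
# Hydroxyproline is absent because it is not directly encoded by a codon.
CODON_CHOICE = {
    "Met": "ATG", "His": "CAC", "Pro": "CCG", "Ala": "GCG",
    "Gly": "GGC", "Ser": "AGC", "Thr": "ACC",
}


def encode_extension(residues):
    """Back-translate a list of three-letter residue codes into a DNA string."""
    return "".join(CODON_CHOICE[r] for r in residues)


def translate(dna):
    """Translate DNA back to residues; raise if the insert would shift the frame."""
    if len(dna) % 3 != 0:
        raise ValueError("insert length is not a multiple of three: frame would shift")
    reverse = {codon: aa for aa, codon in CODON_CHOICE.items()}
    return [reverse[dna[i:i + 3]] for i in range(0, len(dna), 3)]


# Hypothetical extension with alternating His and Pro residues.
extension = ["Met", "Gly", "Pro", "His", "Pro", "His", "Pro", "His", "Pro"]
insert = encode_extension(extension)
print(insert, len(insert))
print(translate(insert) == extension)  # round trip preserves the extension
```

Because the insert length is a multiple of three, placing it directly upstream of the first codon of the core gene leaves the core in its original reading frame.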

In addition to the utility of fusion proteins as prodrugs, the present invention relates to
the purification and processing of biologically active recombinant polypeptides. The desired
biologically active recombinant polypeptides are most preferably produced in a soluble form or
secreted from the host. According to the present invention, the extension portion of the fusion
protein can be recognized by purification means. The fusion protein is purified from the
material present in the secretion media or extraction solution in which it is contained and then
processed to remove the extension portion from the core protein portion, thus producing
purified desired protein. Accordingly, desired proteins most suited for processing as fusion
proteins according to the present invention are those biologically active polypeptides which are
not themselves substrates for DPP IV cleavage.

In accordance with the present invention, a gene sequence encoding a desired
protein is isolated, synthesized or otherwise obtained and operably linked to a DNA sequence
encoding an extension portion. The hybrid gene containing the gene for a desired protein
operably linked to a DNA sequence encoding an extension portion is referred to as a chimeric
gene.

Methods and materials for preparing chimeric genes and recombinant vectors,
transforming or transfecting host cells using the same, replicating the vectors in host cells and
expressing biologically active foreign polypeptides and proteins are described in Principles of
Gene Manipulation, by Old and Primrose, 2nd edition, 1981 and Sambrook et al., Molecular
Cloning, 2nd Edition, Cold Spring Harbor Laboratory Press, NY (1989), both incorporated
herein by reference.

The present invention relates to recombinant chimeric genes which encode fusion
proteins, expression vectors containing the same, hosts transformed or transfected with these
expression vectors, and processes for obtaining these genes, expression vectors, and hosts
transformed or transfected with said vectors.

The present invention may be used to purify any prokaryotic or eukaryotic protein that
can be expressed as the product of recombinant DNA technology in a transformed or
transfected host cell. These recombinant protein products include hormones, receptors,
enzymes, storage proteins, blood proteins, mutant proteins produced by protein engineering
techniques, or synthetic proteins. The desired polypeptides produced may include HIV RNase
H, tPA, IL-1, IL-1 receptor, CD4, human nerve growth factor, sCD4-PE40, human respiratory
syncytial virus (RSV) FG chimeric glycoprotein (see U.S. Patent Application Serial No.
07/543,780, incorporated herein by reference), EGF, IGF-1, IGF-2, glucagon, corticotropin
releasing factor (CRF), dynorphin, endothelin, transforming growth factor α (TGF-α),
Pseudomonas exotoxin 40 (PE40), transforming growth factor-β (TGF-β), insulin and analogs
thereof.

Examples of purification means include IMAC and immunoaffinity. Other purification 
means which employ the use of extension peptides that can be removed using DPP IV are 
within the contemplated scope of the present invention. 

In one embodiment of the present invention, fusion proteins comprising a biologically
active polypeptide portion and an extension portion which is a metal chelating peptide are useful
in an immobilized metal affinity chromatography system.

Immobilized Metal Ion Affinity Chromatography (IMAC) for fractionating proteins was
first disclosed by Porath, J. et al., Nature 258:598-599 (1975). Porath disclosed derivatizing a
resin with iminodiacetic acid (IDA) and chelating metal ions to the IDA-derivatized resin.
Porath disclosed that proteins could be immobilized in a column which contained immobilized
metal ions. The technique involves attaching the commonly used IDA to a matrix followed by
chelating a metal ion to the IDA-containing resin. The proteins bind to the metal ion(s) through
functional groups of amino acid residues capable of donating electrons. Potential electron
donating amino acid residues are cysteine, histidine, and tryptophan. Proteins interact with
metal ions through one or more of these amino acids with electron donating side chains.
Smith et al. disclose in U.S. Patent No. 4,569,794, incorporated by reference herein,
that certain amino acid residues are responsible for the binding of the protein to the
immobilized metal ions. However, the bound protein can be eluted by lowering the pH or
using competitive counter ligands such as imidazole if histidine side chains are involved in the
binding. Histidine-containing di- or tripeptides in proteins have been used to show that IMAC
is a selective purification technique. Accordingly, Smith et al. disclose using recombinant
DNA techniques to produce a fusion protein comprising a metal chelating peptide covalently
linked to a desired polypeptide. The metal chelating peptide is an extension portion that is
effectively a handle to the desired polypeptide. This handle can be used in protein purification.
Use of IMAC technology with metal chelating peptides having alternating His residues
is disclosed in U.S. Patent Application Serial No. 07/506,605, which is incorporated herein by
reference. U.S. Patent Application Serial No. 07/506,605 discloses specific metal chelating
peptides which provide unexpectedly superior results in the IMAC purification of a fusion
protein when the metal chelating peptide comprises three to six alternating His residues.
Following the teachings of U.S. Patent Application Serial No. 07/506,605 and U.S. Patent No.
4,569,794, it is possible to employ the commonly used IDA resin in IMAC for the purification
of fusion proteins having a metal chelating peptide portion with at least three alternating
histidine residues which are constituents of DPP IV-recognized sequences. Construction of
fusion proteins and their use in an IMAC system is taught by U.S. Patent No. 4,569,794.
Construction and use of a metal chelating peptide portion comprising alternating His residues is
taught in U.S. Patent Application Serial No. 07/506,605. By providing a fusion protein with
DPP IV recognizable residues between alternating His residues, the present invention provides a
fusion protein which can be purified using IMAC technology and subsequently processed with
DPP IV to yield a desired polypeptide.

According to this embodiment of the present invention, the extension portion is a metal
chelating peptide which can be represented by the formula:

A-X-Y(X'-Y)n

wherein A is optional, and when present is methionine;
n represents the number of sequentially linked X'-Y groups, that number representing
from 0 to 20 of such groups, preferably 0 to 10 groups;
X is selected from the group consisting of any naturally occurring amino acid;
Y is selected from the group consisting of proline, alanine, serine, and threonine,
except when n = 0, then Y is selected from the group consisting of alanine, serine, and
threonine;
X' is selected from the group consisting of any naturally occurring amino acid except
proline or hydroxyproline;
and wherein at least two to three residues designated X' and, optionally, X are histidine
(His). Preferably, Y is Pro and n is at least 3. When treated with DPP IV, the N-terminal
extension is cleaved in a stepwise fashion, producing the biologically active polypeptide
provided the biologically active polypeptide is not itself a DPP IV substrate.

One example of a fusion protein includes an extension portion having the formula

His-Y(His-Y)n

wherein n = 3 to 8, and Ys are Pro, Hyp, Ala, Ser or Thr, Pro being the most
preferred. Another example of a fusion protein includes an extension portion having the
formula

X-Y(His-Y)n

wherein n is 3 to 8; X is any naturally occurring amino acid; and Ys are Pro, Hyp,
Ala, Ser or Thr, Pro being the most preferred.

Since N-terminal Met is a consequence of protein synthesis in E. coli and it is known to
be processed by the E. coli enzymatic system when the adjacent amino acid is Pro, Gly, Ala or
Ser, the following extensions represent sequences useful for the IMAC purification and
cleavage of biologically active peptides or proteins expressed intracellularly in E. coli by
recombinant DNA techniques:

Met-X-Y(His-Y)n

wherein n = 3 to 8; X is Pro, Gly, Ser, or Ala; and Ys are Pro, Hyp, Ala, Ser, or
Thr, Pro being the most preferred.
In another example, if the peptide or protein desired is to be secreted from a given host
after transformation or transfection, then the vectors could be designed so as to secrete the
protein or polypeptide using an extension portion which facilitates transport, such as:

X-Y-(His-Y)n

wherein n = 3 to 8, X could be any naturally occurring amino acid compatible with the
secretion system from a given host, and Ys are Pro, Hyp, Ala, Ser, or Thr.
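
The extension formulas above lend themselves to a small generator. The following Python sketch builds a sequence of the form X-Y(His-Y)n, optionally preceded by Met, and enforces the stated ranges. The function name and parameters are hypothetical, and for simplicity the sketch uses the same Y at every position even though the formulas allow each Y to be chosen independently.

```python
ALLOWED_Y = {"Pro", "Hyp", "Ala", "Ser", "Thr"}
# Residues next to an N-terminal Met for which the text says E. coli processes the Met.
ALLOWED_X_AFTER_MET = {"Pro", "Gly", "Ser", "Ala"}


def build_extension(x, y, n, leading_met=False):
    """Return the extension X-Y(His-Y)n as a hyphen-separated string.

    With leading_met=True the form Met-X-Y(His-Y)n is produced.
    """
    if not 3 <= n <= 8:
        raise ValueError("n must be from 3 to 8 for these example formulas")
    if y not in ALLOWED_Y:
        raise ValueError("Y must be Pro, Hyp, Ala, Ser or Thr")
    if leading_met and x not in ALLOWED_X_AFTER_MET:
        raise ValueError("with a leading Met, X must be Pro, Gly, Ser or Ala")
    residues = (["Met"] if leading_met else []) + [x, y] + ["His", y] * n
    return "-".join(residues)


print(build_extension("Gly", "Pro", 3, leading_met=True))
# Met-Gly-Pro-His-Pro-His-Pro-His-Pro
print(build_extension("His", "Ala", 3))
# His-Ala-His-Ala-His-Ala-His-Ala
```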

Another protein purification system which uses fusion proteins and which is well suited
for DPP IV processing technology is immunoaffinity purification. U.S. Patent No. 4,782,137
issued November 1, 1988 to Hopp et al., incorporated herein by reference, discloses the
synthesis of a fusion protein having a highly antigenic N-terminal portion and a desired
polypeptide at the C-terminal portion. According to Hopp et al., the fusion proteins are
purified from crude supernatant by passing the crude supernatant through a column containing
immobilized antibodies which recognize the antigenic portion of the fusion protein. The
immobilized antibodies keep the protein in the column while the undesired components of the
supernatant are eluted. The column conditions can then be changed to cause the
antigen-antibody complex to dissociate.

According to the present invention, the highly antigenic N-terminal portion of the
fusion protein is an extension portion which contains DPP IV recognizable residues. After
collection as described in the Hopp patent, the fusion protein according to the present invention
can be exposed to DPP IV, thereby removing the extension portion. One of ordinary skill in
the art could practice the immunoaffinity purification system of Hopp with N-terminal
extensions according to the present invention.

The embodiments and examples described herein serve to illustrate the nature of the
present invention and are not meant to limit the scope of the invention. Contemplated
equivalents include fusion proteins which have N-terminal extensions which can be processed
by at least one other means such that removal of the extension is due to a combination of
means. Contemplated equivalents also include fusion polypeptides comprising chemically
modified amino acid residues.
EXAMPLES 

Example 1 Synthetic Prodrugs which are Fusion Proteins Having Core Proteins that are
DPP IV Substrates

Fusion polypeptides that can be synthesized and administered as prodrugs have a DPP
IV degradable N-terminal extension covalently linked to the N-terminus of the biologically
active polypeptide. The formula for these prodrugs can be expressed as:

extension portion - core protein drug portion

wherein "extension portion" represents a DPP IV cleavable N-terminal extension; " - "
represents a covalent peptide bond; and "core protein portion" represents any desired peptide
which is liberated from the extension portion by DPP IV processing. In this example the core
protein of the fusion protein is a potential substrate for DPP IV following removal of the
extension portion.

Synthetic prodrugs can be produced using peptide synthesis techniques well known in
the art.

In one embodiment, the core protein portion is epidermal growth factor (EGF) and the
extension portion is Gly-Pro-Phe-Ala:

Gly⁻⁴-Pro⁻³-Phe⁻²-Ala⁻¹-[EGF].
In another embodiment, the core protein portion is glucagon and the extension portion
is Ala-Pro-Phe-Ala:

Ala⁻⁴-Pro⁻³-Phe⁻²-Ala⁻¹-[GLUCAGON].

In another embodiment, the core protein portion is [Ala¹⁵ Leu²⁷]-bGRF(1-29)NH₂
(Seq ID 3) and the extension portion is Tyr-Ala:

Tyr⁻²-Ala⁻¹-{[Ala¹⁵ Leu²⁷]-bGRF(1-29)NH₂}.
Example 2 Synthetic Prodrugs Which are Fusion Proteins Having Core Proteins that are
not DPP IV Substrates

Fusion polypeptides that can be synthesized and administered as prodrugs have a DPP
IV degradable N-terminal extension covalently linked to the N-terminus of the biologically
active polypeptide. The formula for these prodrugs can be expressed as:

extension portion - core protein drug portion

wherein "extension portion" represents a DPP IV cleavable N-terminal extension; " - "
represents a covalent peptide bond; and "core protein portion" represents any desired peptide
which is liberated from the extension portion by DPP IV processing.

Synthetic prodrugs can be produced using peptide synthesis techniques well known in
the art.

In one embodiment, the core protein portion is a bGRF analog, [Val² Ser⁸,²⁸ Leu²⁷]-
bGRF(1-33)OH (Seq ID 1), and the extension portion is Gly-Pro-Tyr-Ala:

Gly⁻⁴-Pro⁻³-Tyr⁻²-Ala⁻¹-{[Val² Ser⁸,²⁸ Ala¹⁵ Leu²⁷]-bGRF(1-33)OH}.

In another embodiment, the core protein portion is corticotropin releasing factor (CRF)
and the extension portion is Gly-Pro-Phe-Ala:

Gly⁻⁴-Pro⁻³-Phe⁻²-Ala⁻¹-[CRF].

In another embodiment, the core protein portion is dynorphin and the extension portion is
Phe-Pro-Phe-Ala:

Phe⁻⁴-Pro⁻³-Phe⁻²-Ala⁻¹-[DYNORPHIN].

In another embodiment, the core protein portion is somatostatin-28 and the extension
portion is Gly-Pro-Phe-Pro:

Gly⁻⁴-Pro⁻³-Phe⁻²-Pro⁻¹-[SOMATOSTATIN-28].

In another embodiment, the core protein portion is endothelin and the extension portion
is Ala-Pro-Phe-Ala:

Ala⁻⁴-Pro⁻³-Phe⁻²-Ala⁻¹-[ENDOTHELIN].

In another embodiment, the core protein portion is a bGRF analog, [Ile² Ser⁸,²⁸ Ala¹⁵
Leu²⁷]-bGRF(1-40)OH (Seq ID 2), and the extension portion is Phe-Ala:

Phe⁻²-Ala⁻¹-{[Ile² Ser⁸,²⁸ Ala¹⁵ Leu²⁷]-bGRF(1-40)OH}.

In another embodiment, the core protein portion is [Ile² Ala¹⁵ Leu²⁷]-bGRF(1-29)NH₂
(Seq ID 4) and the extension portion is Tyr-Ser:

Tyr⁻²-Ser⁻¹-{[Ile² Ala¹⁵ Leu²⁷]-bGRF(1-29)NH₂}.

Example 3 Sustained Presence of bGRF Analog Leu²⁷-bGRF(1-29)NH₂

A bGRF analog, Leu²⁷-bGRF(1-29)NH₂, its sequence shown as Seq ID 5, can be
administered as a medicament comprising the core protein shown in Seq ID 5 and a variety of
N-terminally extended prodrugs.

Several versions of prodrugs can be made by well known methods using Seq ID 5 as
the core protein portion. Extension portions for these Seq ID 5-based prodrugs are Ile-Ala,
Gly-Pro-Ile-Pro, Seq ID 6, Seq ID 7, Tyr-Ala, Gly-Pro-Tyr-Ala, Seq ID 8, Seq ID 9, Seq ID
10, Seq ID 11, Seq ID 12, Seq ID 13, Tyr-Ala-Tyr-Ala and Val-Ala.
Example 4 Sustained Presence of bGRF Analog [Thr² Ala¹⁵ Leu²⁷]-bGRF(1-29)NH₂

A bGRF analog, [Thr² Ala¹⁵ Leu²⁷]-bGRF(1-29)NH₂, its sequence shown as Seq ID
14, can be administered as a medicament comprising the core protein portion shown in Seq ID
14 and a variety of N-terminally extended prodrugs. Three versions of the prodrug were made
having extension portions of Tyr-Thr, Tyr-Ser, and Tyr-Thr-Tyr-Thr, respectively.
Example 5 Fusion Proteins which Contain HIV RNase H and N-Terminal Extensions

A strategy to purify chimeric proteins from recombinant E. coli is described based on
metal chelating peptide domains containing alternate histidines, with affinity for an immobilized
metal ion. Vectors are constructed to direct the synthesis of fusion proteins using HIV RNase
H as the core protein. As shown below, these fusion proteins are designed to possess
alternating histidines for purification by immobilized metal ion affinity chromatography (IMAC)
and alternating prolines or alternating alanines for DPP IV cleavage to remove the metal
chelating peptide (mcp).

The preferred DPP IV cleavable N-terminal extensions according to the present
invention are outlined as follows:

Fusion protein HIVRH/mcp #1 comprises an N-terminal extension of Seq ID 15 linked
to HIV RNase H:

Met⁻¹¹-Pro⁻¹⁰-Ala⁻⁹-His⁻⁸-...-[HIV RNase H]

Fusion protein HIVRH/mcp #2 comprises an N-terminal extension of Seq ID 16 linked
to HIV RNase H:

Met⁻¹¹-Ala⁻¹⁰-Pro⁻⁹-His⁻⁸-Ala⁻⁷-...-[HIV RNase H]

Fusion protein HIVRH/mcp #3 comprises an N-terminal extension of Seq ID 17 linked
to HIV RNase H:

Met⁻¹¹-Gly⁻¹⁰-Pro⁻⁹-His⁻⁸-...-[HIV RNase H]

These fusion proteins are cloned and expressed in E. coli, and are purified using DEAE
chromatography and RP-HPLC. N-terminal sequencing is used to characterize the fusion
proteins. Application of the alternating histidine-containing fusion proteins to the purification
of recombinant proteins by IMAC and subsequent removal of the N-terminal extension by DPP
IV confirm the utility of the present invention.
Construction of Chimeric Genes Containing HIV RNase H Gene 
All recombinant DNAs are prepared by standard techniques. Oligonucleotides
corresponding to the metal chelating peptide/cleavage sequence are constructed, purified, annealed
and ligated to a gene encoding HIV RNase H to form a chimeric gene.

To prepare expression vectors encoding alternate histidines/DPP IV recognized cleavage
residues/HIV RNase H, a chimeric gene is inserted into the final expression vector. Expression
vectors containing the chimeric gene constructs are used to transform E. coli by standard
techniques. Expression of the genes in E. coli results in the production of the fusion proteins
encoded by the chimeric genes. These fusion proteins contain HIV RNase H amino acids and
an N-terminal extension which contains alternate histidines (metal chelating peptide) and
alternate prolines or alanines.
Preparation of Crude E. coli Extracts and Isolation of Fusion Proteins for Sequencing

Approximately 3 g of E. coli cell paste is suspended in 30 ml of 0.25 M potassium
phosphate, pH 7.2, containing 1 mM each of dithiothreitol (DTT), EDTA, phenylmethylsulfonyl
fluoride (PMSF), and benzamidine HCl, and 10 mg/liter each of aprotinin, leupeptin, and
bestatin. This suspension is passed through a French Press three times to break the cells. Cell
lysates are centrifuged at 12,000 rpm for 1 hr. The supernatant is removed and solid ammonium
sulfate is added to 70% saturation. After stirring for 1 hr, the suspension is centrifuged at
12,000 rpm for 1 hr. The supernatant is discarded and the pellet is redissolved in 2.25 ml of
50 mM Tris pH 7.5 containing 1 mM DTT, PMSF, and benzamidine. The solution is then dialyzed
overnight at 4°C against 20 mM Tris, 50 mM NaCl, 1 mM DTT, 10% glycerol, and 0.1 mM EDTA,
pH 7.5 (Buffer A). The dialysate is collected, diluted with one volume of Buffer A, and
applied to a 10 ml column of washed DEAE cellulose equilibrated in Buffer A. The run
through is collected batchwise and the column is further washed with 50 ml of Buffer A. These
solutions are collected, pooled, concentrated by 70% ammonium sulfate precipitation,
resuspended in 2 ml of Buffer A and dialyzed as described above. The concentrated RNase H
(RH) is used for characterization by N-terminal sequence analysis.
Purification of Fusion Proteins Using IMAC 

The feasibility of using a metal chelating peptide for the purification of recombinant
proteins from crude extracts can be demonstrated by using the following chimerics expressed in
recombinant E. coli with HIV RNase H as the model protein. Fusion proteins HIV RNase
H/mcp #1, HIV RNase H/mcp #2 and HIV RNase H/mcp #3 are each purified.

IMAC columns are prepared as follows. Chelating Sepharose Fast Flow from
Pharmacia is washed thoroughly with Milli-Q water on a sintered glass filter. The gel is then
resuspended in water to form a slurry. The slurry is poured carefully into a glass column
(Pharmacia) to a volume of 6 ml (1 x 7 cm). After the gel has settled, the column is washed
with 5 volumes of 50 mM EDTA (ethylenediaminetetraacetic acid) pH 8.0. Following this, the
column is washed with 5 volumes of 0.2 N NaOH and 5 volumes of Milli-Q water. The
column then is charged with 5 volumes of 50 mM NiSO4 (or ZnCl2 or CuSO4). Finally, the
column is washed with 5 bed volumes of equilibration buffer. The equilibration buffer is made
up of 20 mM Tris pH 8.0, containing 500 mM NaCl, 1 mM PMSF, 1 mM benzamidine, 10
mg/L leupeptin, and 10 mg/L aprotinin.

After the column has been equilibrated with at least 5 volumes of equilibration buffer,
5-10 ml of crude recombinant E. coli extract are applied to the column by gravity. After all the
crude material has entered the column, the column is washed with 10 column volumes of
equilibration buffer, pH 8.0, containing 1.0 M NaCl instead of 500 mM NaCl.

The column is then eluted with increasing concentrations of imidazole in the
equilibration buffer at pH 8.0. For the earlier experiments, a large number of elutions are
performed for each experiment to determine the concentration at which the chimeric protein
elutes. Later this elution is simplified and usually just three imidazole concentrations are
used: 35 mM, 100 mM, and 300 mM imidazole in the equilibration buffer, pH 8.0. Ten bed
volumes of each imidazole buffer are used. Between elutions, the column is washed with 10
volumes of equilibration buffer. Finally, the column is stripped with 5 bed volumes of 50 mM
EDTA pH 8.0 to determine if any protein is still bound to the column. The flow rates for the
columns are 1.0 ml/min. 5 ml fractions are collected. The columns are run at room temperature.
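
The volumes implied by the simplified elution scheme follow directly from the figures given above (6 ml bed volume, ten bed volumes per imidazole step, 1.0 ml/min flow, 5 ml fractions). The short Python sketch below works through that arithmetic; the function and constant names are hypothetical, and the sketch assumes a constant flow rate throughout each step.

```python
import math

BED_VOLUME_ML = 6.0            # column bed volume stated in the text
FLOW_RATE_ML_PER_MIN = 1.0     # stated flow rate
FRACTION_SIZE_ML = 5.0         # stated fraction size
IMIDAZOLE_STEPS_MM = [35, 100, 300]
BED_VOLUMES_PER_STEP = 10


def step_plan(bed_volumes):
    """Buffer volume, run time and fraction count for a step of the given size."""
    volume_ml = bed_volumes * BED_VOLUME_ML
    return {
        "volume_ml": volume_ml,
        "minutes": volume_ml / FLOW_RATE_ML_PER_MIN,
        "fractions": math.ceil(volume_ml / FRACTION_SIZE_ML),
    }


for concentration in IMIDAZOLE_STEPS_MM:
    plan = step_plan(BED_VOLUMES_PER_STEP)
    print(f"{concentration} mM imidazole: {plan['volume_ml']:.0f} ml, "
          f"{plan['minutes']:.0f} min, {plan['fractions']} fractions")
```

Under these assumptions each imidazole step uses 60 ml of buffer, takes 60 minutes and yields 12 fractions; the same holds for each 10-volume wash between elutions.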

Commercially available Pierce protein assay kits are used to determine the protein 
content of the samples. 

HIV RNase H activity is determined by the method described in Becerra, S. P. et al.,
FEBS 270(1,2):76-80 (September 1990), incorporated herein by reference.
Conversion of the N-terminal extended fusion proteins to mature proteins 

Commercially available DPP IV purified from human placenta (Enzyme Systems
Products, Dublin, California) with a specific activity of 5200 mU per mg protein is used. One U
is equivalent to hydrolysis of 1 µmole of a synthetic substrate,
Ala-Pro-7-amino-4-trifluoromethylcoumarin, in one minute at pH 7.8. Enzymatic conversion is
carried out by incubation of the fusion protein (about 1-100 mg) at a concentration of 1-10 mg/ml
with DPP IV at 25°C for 30 minutes at an enzyme to substrate ratio of 1:100 (w/w). The desired
polypeptide is recovered from the uncleaved fusion protein by IMAC. The authenticity is
confirmed by N-terminal sequence analysis.
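
For a given batch, the amount of DPP IV implied by the stated 1:100 (w/w) ratio and 5200 mU/mg specific activity can be worked out as in the following Python sketch. The function name and the 10 mg, 5 mg/ml batch in the usage lines are hypothetical values chosen within the stated ranges.

```python
SPECIFIC_ACTIVITY_MU_PER_MG = 5200.0   # stated specific activity of the enzyme
ENZYME_TO_SUBSTRATE_RATIO = 1.0 / 100  # stated 1:100 (w/w) ratio


def dpp_iv_required(fusion_protein_mg, concentration_mg_per_ml):
    """Enzyme mass, enzyme activity and reaction volume for one conversion batch."""
    if not 1.0 <= concentration_mg_per_ml <= 10.0:
        raise ValueError("the text specifies a concentration of 1-10 mg/ml")
    enzyme_mg = fusion_protein_mg * ENZYME_TO_SUBSTRATE_RATIO
    return {
        "enzyme_mg": enzyme_mg,
        "enzyme_mU": enzyme_mg * SPECIFIC_ACTIVITY_MU_PER_MG,
        "reaction_volume_ml": fusion_protein_mg / concentration_mg_per_ml,
    }


# Hypothetical batch: 10 mg of fusion protein at 5 mg/ml.
batch = dpp_iv_required(10.0, 5.0)
print(f"{batch['enzyme_mg']:.2f} mg DPP IV, about {batch['enzyme_mU']:.0f} mU, "
      f"in {batch['reaction_volume_ml']:.1f} ml")
```

For this hypothetical batch the sketch gives 0.10 mg of enzyme, roughly 520 mU, in a 2.0 ml reaction.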

Example 6. Processing of bGRF Analog prodrugs in bovine plasma in vitro.

Table 1 summarizes representative experiments to demonstrate generation of the core
peptide, [Leu²⁷]-bGRF(1-29)NH₂ (Seq ID 5), from its three N-terminally extended analogs:
Tyr⁻⁴-Ala⁻³-Tyr⁻²-Ala⁻¹-{[Leu²⁷]-bGRF(1-29)NH₂} (Seq ID 19),
Ile⁻²-Ala⁻¹-{[Leu²⁷]-bGRF(1-29)NH₂} (Seq ID 18) and
Tyr⁻²-Ala⁻¹-{[Leu²⁷]-bGRF(1-29)NH₂} (Seq ID 25), upon incubation
with bovine plasma in vitro. The only metabolites detected in the incubation mixtures were
those which were products of DPP-IV-related cleavages.

Tyr⁻⁴-Ala⁻³-Tyr⁻²-Ala⁻¹-{[Leu²⁷]-bGRF(1-29)NH₂} (Seq ID 19) was sequentially
converted over time to Tyr⁻²-Ala⁻¹-{[Leu²⁷]-bGRF(1-29)NH₂} (Seq ID 25), to
[Leu²⁷]-bGRF(1-29)NH₂ (core peptide, Seq ID 5) and finally to [Leu²⁷]-bGRF(3-29)NH₂
(Seq ID 24). Tyr⁻²-Ala⁻¹-{[Leu²⁷]-bGRF(1-29)NH₂} (Seq ID 25) was converted to
[Leu²⁷]-bGRF(1-29)NH₂ (core peptide, Seq ID 5) and then to [Leu²⁷]-bGRF(3-29)NH₂
(Seq ID 24). The core peptide itself, [Leu²⁷]-bGRF(1-29)NH₂ (Seq ID 5), was converted to
[Leu²⁷]-bGRF(3-29)NH₂ (Seq ID 24) by plasma DPP-IV.

Even though no other metabolites were observed under the HPLC conditions used in
the experiments, the peptide [Leu²⁷]-bGRF(3-29)NH₂ (Seq ID 24) disappeared over time, which
indicates that other, non-DPP-IV related, proteolyses must have also been taking place but at
significantly slower rates. Not only was the core bGRF peptide generated from the
DPP-IV-cleavable bGRF prodrugs shown here, but also the half-life of the core protein generated
from the fusion protein was significantly prolonged in vitro as compared with the half-life of the
core peptide (Seq ID 5) incubated directly with bovine plasma (Table 1). Moreover, the time
during which the core peptide released from the prodrug is present seems to correlate well with
the prodrug extension length: the half-life of Seq ID 5 generated from the prodrug with four
amino acid residues in the extension part (Seq ID 19) was longer than that derived from proGRFs
having only two amino acids in the extension, as in Seq ID 18 or 25.
Example 7 In vivo and in vitro bioactivity of bGRF Analog prodrugs.

As shown in Table 2, plasma growth hormone (GH) levels were elevated when Holstein
steers were injected iv with analog Seq ID 18. At the dose of 0.2 nmol/kg body weight, the
induced growth hormone levels in plasma were comparable to those generated upon iv injection
with the same dose of the core peptide (Seq ID 5). It is important to stress that the extended
peptide Seq ID 18 had only ca. 5% of the inherent potency of the core peptide Seq ID 5 when
both were tested in the in vitro pituitary cell assay for GH release. Therefore, the comparable
in vivo activity of these two peptides seems to indicate that the core peptide could have been
released from the extended peptide in vivo.

The same pattern of GH release in vivo was also observed for treatment with the bGRF
prodrug having four amino acid residues in the extension (Seq ID 19), as shown in Table 3.
There was no significant difference in the in vivo induced GH release upon treatment with
prodrugs having either 4 or 2 amino acids in the extension (peptides Seq ID 19 and Seq ID 18,
respectively), despite the significant difference in the in vitro half-life of the core peptide
generated from these two bGRF prodrugs, as indicated in Table 1. The in vivo growth
hormone release was rapid and the same for the core peptide Seq ID 5 as for Seq ID 19
and 18, with no difference in the time of the GH peak following the challenge with bGRF
analogs.

Our interpretation of these results is that the rapid GH release from
prodrugs in vivo is most likely due to the overall high tissue and organ DPP-IV levels, which are
ca. 100-fold higher than the plasma DPP-IV concentration. This would explain the difference in the rate
of prodrug processing in vivo as compared with the cleavages observed in the in vitro
experiments summarized in Table 1. It is also feasible that the half-life of the core peptide
generated from its prodrug precursors was extended in vivo, but not sufficiently to show an
altered (extended) growth hormone release. It is known that GH release in vivo is modulated by
the stimulatory effect of bGRF and the inhibitory action of somatostatin (somatotropin release
inhibitory factor, SRIF). In the meal-fed steer model of Moseley et al. (J. Endocr. 117:253-259
[1988]), animals are injected iv with GRF two hours prior to feeding because the
pituitary is more responsive to a GRF challenge before feeding than after feeding.
Factors associated with feeding, such as release of gut/pancreatic SRIF, may interfere with the
ability of the pituitary to release GH. In other words, no GH will be released from the
pituitary, even in the presence of GRF, during the SRIF overtone. Normally, in the
unchallenged meal-fed steer model, serum GH concentration declines to basal levels for 3 to 6
hrs after feeding (the so-called trough period), with another exogenous episodic GH pulse at 5 to 8
hrs following feeding. In response to GRF injection 2 hrs before feeding, the GH response
occurs rapidly, within 5-20 min., and the GH level remains elevated for 120 to 240 min. before
returning to baseline. In the case of the GRF prodrugs tested so far, only the first exogenous GH
peak was elevated; the second one did not show any treatment effect. It is possible that the
half-life of the core GRF generated from prodrugs in our experiments was not extended long
enough to allow the core peptide to be present in the circulation in sufficient concentrations
to affect the second exogenous burst of GH, usually 4-6 hours after the first one.

Taken together, our results support the general prodrug concept disclosed here because:
(i) bGRF prodrugs having DPP-IV-cleavable N-terminal extensions were processed successfully
to produce the core peptide(s) in bovine plasma in vitro via DPP-IV-mediated cleavages; (ii) the
in vitro half-life of the core peptide generated from the prodrugs was significantly longer and
was a function of the N-terminal extension length in the prodrug; (iii) the fact that the bGRF
prodrugs with very low inherent potency were as effective as the core peptide in the release of
GH in vivo indicates that most likely the core peptide was generated in vivo as anticipated.

Example 8 Preparation of Gly⁻⁴-Pro⁻³-Ile⁻²-Pro⁻¹ {[Leu²⁷] bGRF(1-29)NH₂},
trifluoroacetate salt.

The synthesis of the GRF analog peptide Seq ID 26, having the formula:
Gly-Pro-Ile-Pro-Tyr-Ala-Asp-Ala-Ile-Phe-Thr-Asn-Ser-Tyr-Arg-Lys-Val-Leu-Gly-Gln-Leu-Ser-
Ala-Arg-Lys-Leu-Leu-Gln-Asp-Ile-Leu-Asn-Arg-NH₂ (as the CF₃COOH salt) is conducted in a
stepwise manner as in procedure A which is described in published PCT patent application
PCT/US90/02923, incorporated herein by reference. Amino acid analysis, theoretical values in
parentheses: Asp 4.16 (4); Thr 1.07 (1); Ser 1.81 (2); Glu 2.07 (2); Pro 1.98 (2); Gly 1.99
(2); Ala 2.99 (3); Val 1.13 (1); Ile 2.84 (3); Leu 5.08 (5); Tyr 1.93 (2); Phe 0.96 (1); Lys
2.04 (2); Arg 2.97 (3).
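
The theoretical values quoted in parentheses follow directly from the residue counts of the sequence. The sketch below tallies them for Seq ID 26 as given in the sequence listing; it assumes the analysis was run on an acid hydrolysate, in which Asn is recovered as Asp and Gln as Glu, which is why those pairs are pooled. The helper name is hypothetical.

```python
from collections import Counter

# Acid hydrolysis converts Asn to Asp and Gln to Glu, so each pair is pooled.
HYDROLYSIS_MAP = {"Asn": "Asp", "Gln": "Glu"}


def theoretical_composition(sequence):
    """Count residues in a hyphen-separated three-letter sequence (C-terminal NH2 ignored)."""
    residues = [r for r in sequence.replace("-", " ").split() if r != "NH2"]
    return Counter(HYDROLYSIS_MAP.get(r, r) for r in residues)


SEQ_ID_26 = ("Gly-Pro-Ile-Pro-Tyr-Ala-Asp-Ala-Ile-Phe-Thr-Asn-Ser-Tyr-Arg-Lys-"
             "Val-Leu-Gly-Gln-Leu-Ser-Ala-Arg-Lys-Leu-Leu-Gln-Asp-Ile-Leu-Asn-Arg-NH2")

composition = theoretical_composition(SEQ_ID_26)
for residue, count in sorted(composition.items()):
    print(residue, count)
```

The same tally can be applied to the other analogs as a quick consistency check between a written formula and its reported analysis.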

Example 9 Preparation of Tyr⁻⁶-Ala⁻⁵-Gly⁻⁴-Pro⁻³-Ile⁻²-Pro⁻¹ {[Leu²⁷] bGRF(1-29)NH₂},
trifluoroacetate salt.

The synthesis of the GRF analog peptide Seq ID 27, which comprises Seq ID 6 as the
extension portion and which has the formula:
Tyr-Ala-Gly-Pro-Ile-Pro-Tyr-Ala-Asp-Ala-Ile-Phe-Thr-Asn-Ser-Tyr-Arg-Lys-Val-Leu-Gly-
Gln-Leu-Ser-Ala-Arg-Lys-Leu-Leu-Gln-Asp-Ile-Leu-Asn-Arg-NH₂ (as the CF₃COOH salt) is
conducted in a stepwise manner as in procedure A which is described in published PCT patent
application PCT/US90/02923, incorporated herein by reference. Amino acid analysis,
theoretical values in parentheses: Asp 4.04 (4); Thr 1.03 (1); Ser 1.74 (2); Glu 2.05 (2); Pro
1.99 (2); Gly 2.00 (2); Ala 4.01 (4); Val 1.28 (1); Ile 2.84 (3); Leu 5.09 (5); Tyr 2.94 (3);
Phe 0.97 (1); Lys 2.07 (2); Arg 3.00 (3).

Example 10 Preparation of Lys⁻⁸-Pro⁻⁷-Tyr⁻⁶-Ala⁻⁵-Gly⁻⁴-Pro⁻³-Ile⁻²-Pro⁻¹ {[Leu²⁷]
bGRF(1-29)NH₂}, trifluoroacetate salt.

The synthesis of the GRF analog peptide Seq ID 28, which comprises Seq ID 7 as the
extension portion and which has the formula:
Lys-Pro-Tyr-Ala-Gly-Pro-Ile-Pro-Tyr-Ala-Asp-Ala-Ile-Phe-Thr-Asn-Ser-Tyr-Arg-Lys-Val-Leu-
Gly-Gln-Leu-Ser-Ala-Arg-Lys-Leu-Leu-Gln-Asp-Ile-Leu-Asn-Arg-NH₂ (as the CF₃COOH salt)
is conducted in a stepwise manner as in procedure A which is described in published PCT
patent application PCT/US90/02923, incorporated herein by reference. Amino acid analysis,
theoretical values in parentheses: Asp 4.04 (4); Thr 0.95 (1); Ser 1.78 (2); Glu 2.04 (2); Pro
2.91 (3); Gly 1.98 (2); Ala 3.91 (4); Val 0.96 (1); Ile 2.86 (3); Leu 5.08 (5); Tyr 3.06 (3);
Phe 0.97 (1); Lys 3.06 (3); Arg 3.08 (3).

Example 11 Preparation of Gly⁻⁴-Pro⁻³-Tyr⁻²-Ala⁻¹ {[Leu²⁷] bGRF(1-29)NH₂},
trifluoroacetate salt.

The synthesis of the GRF analog peptide Seq ID 29, having the formula:
Gly-Pro-Tyr-Ala-Tyr-Ala-Asp-Ala-Ile-Phe-Thr-Asn-Ser-Tyr-Arg-Lys-Val-Leu-Gly-Gln-Leu-
Ser-Ala-Arg-Lys-Leu-Leu-Gln-Asp-Ile-Leu-Asn-Arg-NH₂ (as the CF₃COOH salt) is conducted
in a stepwise manner as in procedure A which is described in published PCT patent application
PCT/US90/02923, incorporated herein by reference. Amino acid analysis, theoretical values in
parentheses: Asp 4.01 (4); Thr 0.96 (1); Ser 1.80 (2); Glu 2.02 (2); Pro 0.97 (1); Gly 1.98
(2); Ala 3.91 (4); Val 0.99 (1); Ile 1.89 (2); Leu 5.08 (5); Tyr 3.05 (3); Phe 0.98 (1); Lys
2.03 (2); Arg 3.06 (3).

Example 12 Preparation of Tyr⁻⁶-Ala⁻⁵-Gly⁻⁴-Pro⁻³-Tyr⁻²-Ala⁻¹ {[Leu²⁷] bGRF(1-29)NH₂},
trifluoroacetate salt.

The synthesis of the GRF analog peptide Seq ID 30, which comprises Seq ID 8 as the
extension portion and which has the formula:
#7 Tyr-Ala-Gly-Pro-Tyr-Ala-Tyr-Ala-Asp-Ala-Ile-Phe-Thr-Asn-Ser-Tyr-Arg-Lys-Val-Leu-Gly-
Gln-Leu-Ser-Ala-Arg-Lys-Leu-Leu-Gln-Asp-Ile-Leu-Asn-Arg-NH₂ (as the CF₃COOH salt) is
conducted in a stepwise manner as in procedure A which is described in published PCT patent
application PCT/US90/02923, incorporated herein by reference. Amino acid analysis,
theoretical values in parentheses: Asp 4.07 (4); Thr 0.96 (1); Ser 1.79 (2); Glu 2.02 (2); Pro
0.99 (1); Gly 1.95 (2); Ala 4.80 (5); Val 0.96 (1); Ile 1.87 (2); Leu 5.09 (5); Tyr 4.11 (4);
Phe 0.97 (1); Lys 2.06 (2); Arg 3.08 (3).

Example 13 Preparation of Lys⁻⁸-Pro⁻⁷-Tyr⁻⁶-Ala⁻⁵-Gly⁻⁴-Pro⁻³-Tyr⁻²-Ala⁻¹ {[Leu²⁷]
bGRF(1-29)NH₂}, trifluoroacetate salt.

The synthesis of the GRF analog peptide Seq ID 31, which comprises Seq ID 9 as the
extension portion and which has the formula:
#8 Lys-Pro-Tyr-Ala-Gly-Pro-Tyr-Ala-Tyr-Ala-Asp-Ala-Ile-Phe-Thr-Asn-Ser-Tyr-Arg-Lys-Val-
Leu-Gly-Gln-Leu-Ser-Ala-Arg-Lys-Leu-Leu-Gln-Asp-Ile-Leu-Asn-Arg-NH₂ (as the CF₃COOH
salt) is conducted in a stepwise manner as in procedure A which is described in published PCT
patent application PCT/US90/02923, incorporated herein by reference. Amino acid analysis,
theoretical values in parentheses: Asp 4.06 (4); Thr 0.95 (1); Ser 1.78 (2); Glu 2.01 (2); Pro
1.95 (2); Gly 1.96 (2); Ala 4.81 (5); Val 0.95 (1); Ile 1.87 (2); Leu 5.09 (5); Tyr 4.12 (4);
Phe 0.96 (1); Lys 3.08 (3); Arg 3.10 (3).

Example 14 Preparation of Phe⁻¹⁰-Ala⁻⁹-Lys⁻⁸-Pro⁻⁷-Tyr⁻⁶-Ala⁻⁵-Gly⁻⁴-Pro⁻³-Tyr⁻²-Ala⁻¹
{[Leu²⁷] bGRF(1-29)NH₂}, trifluoroacetate salt.

The synthesis of the GRF analog peptide Seq ID 32, which comprises Seq ID 10 as the
extension portion and which has the formula:
#9 Phe-Ala-Lys-Pro-Tyr-Ala-Gly-Pro-Tyr-Ala-Tyr-Ala-Asp-Ala-Ile-Phe-Thr-Asn-Ser-Tyr-Arg-
Lys-Val-Leu-Gly-Gln-Leu-Ser-Ala-Arg-Lys-Leu-Leu-Gln-Asp-Ile-Leu-Asn-Arg-NH₂ (as the
CF₃COOH salt) is conducted in a stepwise manner as in procedure A which is described in
published PCT patent application PCT/US90/02923, incorporated herein by reference. Amino
acid analysis, theoretical values in parentheses: Asp 4.16 (4); Thr 1.01 (1); Ser 1.89 (2); Glu
2.08 (2); Pro 1.91 (2); Gly 1.93 (2); Ala 5.72 (6); Val 0.97 (1); Ile 1.90 (2); Leu 5.09 (5);
Tyr 4.09 (4); Phe 1.99 (2); Lys 3.08 (3); Arg 3.04 (3).

Example 15 Preparation of Gly⁻¹²-Pro⁻¹¹-Phe⁻¹⁰-Ala⁻⁹-Lys⁻⁸-Pro⁻⁷-Tyr⁻⁶-Ala⁻⁵-Gly⁻⁴-Pro⁻³-
Tyr⁻²-Ala⁻¹ {[Leu²⁷] bGRF(1-29)NH₂}, trifluoroacetate salt.

The synthesis of the GRF analog peptide Seq ID 33, which comprises Seq ID 11 as the
extension portion and which has the formula:
#10 Gly-Pro-Phe-Ala-Lys-Pro-Tyr-Ala-Gly-Pro-Tyr-Ala-Tyr-Ala-Asp-Ala-Ile-Phe-Thr-Asn-Ser-
Tyr-Arg-Lys-Val-Leu-Gly-Gln-Leu-Ser-Ala-Arg-Lys-Leu-Leu-Gln-Asp-Ile-Leu-Asn-Arg-NH₂
(as the CF₃COOH salt) is conducted in a stepwise manner as in procedure A which is described
in published PCT patent application PCT/US90/02923, incorporated herein by reference.
Amino acid analysis, theoretical values in parentheses: Asp 4.08 (4); Thr 0.96 (1); Ser 1.79
(2); Glu 2.07 (2); Pro 2.88 (3); Gly 2.94 (3); Ala 5.74 (6); Val 0.96 (1); Ile 1.88 (2); Leu
5.13 (5); Tyr 4.11 (4); Phe 1.99 (2); Lys 3.10 (3); Arg 3.09 (3).

Example 16 Preparation of Val⁻¹⁴-Pro⁻¹³-Gly⁻¹²-Pro⁻¹¹-Phe⁻¹⁰-Ala⁻⁹-Lys⁻⁸-Pro⁻⁷-Tyr⁻⁶-Ala⁻⁵-
Gly⁻⁴-Pro⁻³-Tyr⁻²-Ala⁻¹ {[Leu²⁷] bGRF(1-29)NH₂}, trifluoroacetate salt.

The synthesis of the GRF analog peptide Seq ID 34, which comprises Seq ID 12 as the
extension portion and which has the formula:
#11 Val-Pro-Gly-Pro-Phe-Ala-Lys-Pro-Tyr-Ala-Gly-Pro-Tyr-Ala-Tyr-Ala-Asp-Ala-Ile-Phe-Thr-
Asn-Ser-Tyr-Arg-Lys-Val-Leu-Gly-Gln-Leu-Ser-Ala-Arg-Lys-Leu-Leu-Gln-Asp-Ile-Leu-Asn-
Arg-NH₂ (as the CF₃COOH salt) is conducted in a stepwise manner as in procedure A which is
described in published PCT patent application PCT/US90/02923, incorporated herein by
reference. Amino acid analysis, theoretical values in parentheses: Asp 4.04 (4); Thr 0.96 (1);
Ser 1.81 (2); Glu 2.04 (2); Pro 3.80 (4); Gly 2.99 (3); Ala 5.92 (6); Val 1.98 (2); Ile 1.89 (2);
Leu 5.10 (5); Tyr 4.09 (4); Phe 1.99 (2); Lys 3.06 (3); Arg 3.08 (3).

Example 17 Preparation of Arg⁻¹⁶-Pro⁻¹⁵-Val⁻¹⁴-Pro⁻¹³-Gly⁻¹²-Pro⁻¹¹-Phe⁻¹⁰-Ala⁻⁹-Lys⁻⁸-
Pro⁻⁷-Tyr⁻⁶-Ala⁻⁵-Gly⁻⁴-Pro⁻³-Tyr⁻²-Ala⁻¹ {[Leu²⁷] bGRF(1-29)NH₂},
trifluoroacetate salt.

The synthesis of the GRF analog peptide Seq ID 35, which comprises Seq ID 13 as the
extension portion and which has the formula:
#12 Arg-Pro-Val-Pro-Gly-Pro-Phe-Ala-Lys-Pro-Tyr-Ala-Gly-Pro-Tyr-Ala-Tyr-Ala-Asp-Ala-Ile-
Phe-Thr-Asn-Ser-Tyr-Arg-Lys-Val-Leu-Gly-Gln-Leu-Ser-Ala-Arg-Lys-Leu-Leu-Gln-Asp-Ile-
Leu-Asn-Arg-NH₂ (as the CF₃COOH salt) is conducted in a stepwise manner as in procedure A
which is described in published PCT patent application PCT/US90/02923, incorporated herein
by reference. Amino acid analysis, theoretical values in parentheses: Asp 4.10 (4); Thr 0.98
(1); Ser 1.84 (2); Glu 2.03 (2); Pro 4.82 (5); Gly 2.97 (3); Ala 5.91 (6); Val 1.99 (2); Ile 1.88
(2); Leu 5.09 (5); Tyr 4.10 (4); Phe 1.98 (2); Lys 3.03 (3); Arg 4.09 (4).

Example 18 Preparation of Val⁻²-Ala⁻¹ {[Leu²⁷] bGRF(1-29)NH₂}, trifluoroacetate salt.

The synthesis of the GRF analog peptide Seq ID 36, having the formula:
#14 Val-Ala-Tyr-Ala-Asp-Ala-Ile-Phe-Thr-Asn-Ser-Tyr-Arg-Lys-Val-Leu-Gly-Gln-Leu-Ser-Ala-
Arg-Lys-Leu-Leu-Gln-Asp-Ile-Leu-Asn-Arg-NH₂ (as the CF₃COOH salt) is conducted in a
stepwise manner as in procedure A which is described in published PCT patent application
PCT/US90/02923, incorporated herein by reference. Amino acid analysis, theoretical values in
parentheses: Asp 3.98 (4); Thr 0.89 (1); Ser 1.76 (2); Glu 2.02 (2); Gly 1.05 (1); Ala 3.87 (4);
Val 1.85 (2); Ile 1.77 (2); Leu 5.17 (5); Tyr 2.04 (2); Phe 0.97 (1); Lys 2.07 (2); Arg 3.06
(3).

Example 19 Preparation of Tyr⁻²-Thr⁻¹ {[Ala¹⁵ Leu²⁷] bGRF(1-29)NH₂}, trifluoroacetate
salt.

The synthesis of the GRF analog peptide Seq ID 37, having the formula:
#15 Tyr-Thr-Tyr-Ala-Asp-Ala-Ile-Phe-Thr-Asn-Ser-Tyr-Arg-Lys-Val-Leu-Ala-Gln-Leu-Ser-Ala-
Arg-Lys-Leu-Leu-Gln-Asp-Ile-Leu-Asn-Arg-NH₂ (as the CF₃COOH salt) is conducted in a
stepwise manner as in procedure A which is described in published PCT patent application
PCT/US90/02923, incorporated herein by reference. Amino acid analysis, theoretical values in
parentheses: Asp 4.06 (4); Thr 1.86 (2); Ser 1.77 (2); Glu 2.07 (2); Ala 3.98 (4); Val 1.08 (1);
Ile 1.89 (2); Leu 5.14 (5); Tyr 2.94 (3); Phe 0.96 (1); Lys 1.99 (2); Arg 3.04 (3).

Example 20 Preparation of Tyr⁻²-Thr⁻¹ {[Ile² Ala¹⁵ Leu²⁷] bGRF(1-29)NH₂},
trifluoroacetate salt.

The synthesis of the GRF analog peptide Seq ID 38, having the formula:
#16 Tyr-Thr-Tyr-Ile-Asp-Ala-Ile-Phe-Thr-Asn-Ser-Tyr-Arg-Lys-Val-Leu-Ala-Gln-Leu-Ser-Ala-
Arg-Lys-Leu-Leu-Gln-Asp-Ile-Leu-Asn-Arg-NH₂ (as the CF₃COOH salt) is conducted in a
stepwise manner as in procedure A which is described in published PCT patent application
PCT/US90/02923, incorporated herein by reference. Amino acid analysis, theoretical values in
parentheses: Asp 4.07 (4); Thr 1.87 (2); Ser 1.75 (2); Glu 2.07 (2); Ala 2.94 (3); Val 1.09 (1);
Ile 2.87 (3); Leu 5.12 (5); Tyr 2.92 (3); Phe 0.96 (1); Lys 2.00 (2); Arg 3.05 (3).

Example 21 Preparation of Tyr⁻²-Thr⁻¹ {[Thr² Ala¹⁵ Leu²⁷] bGRF(1-29)NH₂},
trifluoroacetate salt.

The synthesis of the GRF analog peptide Seq ID 39, having the formula:
#17 Tyr-Thr-Tyr-Thr-Asp-Ala-Ile-Phe-Thr-Asn-Ser-Tyr-Arg-Lys-Val-Leu-Ala-Gln-Leu-Ser-Ala-
Arg-Lys-Leu-Leu-Gln-Asp-Ile-Leu-Asn-Arg-NH₂ (as the CF₃COOH salt) is conducted in a
stepwise manner as in procedure A which is described in published PCT patent application
PCT/US90/02923, incorporated herein by reference. Amino acid analysis, theoretical values in
parentheses: Asp 4.05 (4); Thr 2.68 (3); Ser 1.77 (2); Glu 2.07 (2); Ala 2.90 (3); Val 1.08 (1);
Ile 1.89 (2); Leu 5.20 (5); Tyr 2.87 (3); Phe 0.93 (1); Lys 2.01 (2); Arg 3.07 (3).

Example 22 Preparation of Tyr⁻²-Ser⁻¹ {[Thr² Ala¹⁵ Leu²⁷] bGRF(1-29)NH₂},
trifluoroacetate salt.

The synthesis of the GRF analog peptide Seq ID 40, having the formula:
#18 Tyr-Ser-Tyr-Thr-Asp-Ala-Ile-Phe-Thr-Asn-Ser-Tyr-Arg-Lys-Val-Leu-Ala-Gln-Leu-Ser-Ala-
Arg-Lys-Leu-Leu-Gln-Asp-Ile-Leu-Asn-Arg-NH₂ (as the CF₃COOH salt) is conducted in a
stepwise manner as in procedure A which is described in published PCT patent application
PCT/US90/02923, incorporated herein by reference. Amino acid analysis, theoretical values in
parentheses: Asp 4.11 (4); Thr 1.82 (2); Ser 2.64 (3); Glu 2.05 (2); Ala 2.90 (3); Val 1.04 (1);
Ile 1.87 (2); Leu 5.16 (5); Tyr 2.92 (3); Phe 0.94 (1); Lys 2.01 (2); Arg 3.04 (3).

Example 23 Preparation of Tyr⁻⁴-Thr⁻³-Tyr⁻²-Thr⁻¹ {[Thr² Ala¹⁵ Leu²⁷] bGRF(1-29)NH₂},
trifluoroacetate salt.

The synthesis of the GRF analog peptide Seq ID 41, having the formula:
#19 Tyr-Thr-Tyr-Thr-Tyr-Thr-Asp-Ala-Ile-Phe-Thr-Asn-Ser-Tyr-Arg-Lys-Val-Leu-Ala-Gln-
Leu-Ser-Ala-Arg-Lys-Leu-Leu-Gln-Asp-Ile-Leu-Asn-Arg-NH₂ (as the CF₃COOH salt) is
conducted in a stepwise manner as in procedure A which is described in published PCT patent
application PCT/US90/02923, incorporated herein by reference. Amino acid analysis,
theoretical values in parentheses: Asp 4.06 (4); Thr 3.66 (4); Ser 1.85 (2); Glu 2.05 (2); Ala
2.93 (3); Val 1.09 (1); Ile 1.91 (2); Leu 5.15 (5); Tyr 3.91 (4); Phe 0.95 (1); Lys 2.00 (2);
Arg 3.04 (3).

Example 24 Preparation of Tyr⁻²-Ala⁻¹ {[Leu²⁷] bGRF(1-29)NH₂}, trifluoroacetate salt.

The synthesis of the GRF analog peptide Seq ID 42, having the formula:
Tyr-Ala-Tyr-Ala-Asp-Ala-Ile-Phe-Thr-Asn-Ser-Tyr-Arg-Lys-Val-Leu-Gly-Gln-Leu-Ser-Ala-Arg-
Lys-Leu-Leu-Gln-Asp-Ile-Leu-Asn-Arg-NH₂ (as the CF₃COOH salt) is conducted in a stepwise
manner as in procedure A which is described in published PCT patent application
PCT/US90/02923, incorporated herein by reference. Amino acid analysis, theoretical values in
parentheses: Asp 4.01 (4); Thr 0.97 (1); Ser 1.88 (2); Glu 2.00 (2); Gly 1.02 (1); Ala 3.88 (4);
Val 0.97 (1); Ile 1.86 (2); Leu 5.03 (5); Tyr 3.03 (3); Phe 0.96 (1); Lys 2.18 (2); Arg 3.03
(3).

Example 25 Preparation of Tyr⁻⁴-Ala⁻³-Tyr⁻²-Ala⁻¹ {[Leu²⁷] bGRF(1-29)NH₂},
trifluoroacetate salt.

The synthesis of the GRF analog peptide Seq ID 43, having the formula:
#13 Tyr-Ala-Tyr-Ala-Tyr-Ala-Asp-Ala-Ile-Phe-Thr-Asn-Ser-Tyr-Arg-Lys-Val-Leu-Gly-Gln-Leu-
Ser-Ala-Arg-Lys-Leu-Leu-Gln-Asp-Ile-Leu-Asn-Arg-NH₂ (as the CF₃COOH salt) is conducted
in a stepwise manner as in procedure A which is described in published PCT patent application
PCT/US90/02923, incorporated herein by reference. Amino acid analysis, theoretical values in
parentheses: Asp 3.97 (4); Thr 0.90 (1); Ser 1.74 (2); Glu 1.98 (2); Gly 1.04 (1); Ala 4.85 (5);
Val 0.91 (1); Ile 1.77 (2); Leu 5.13 (5); Tyr 4.14 (4); Phe 0.99 (1); Lys 2.07 (2); Arg 3.05
(3).

Table 1. In vitro potency and in vitro plasma stability of selected GRF analogs.

Peptide Sequence                                  Seq. ID No.   In Vitro Potency*   In Vitro Plasma t½ (min.)**
[Leu²⁷]-bGRF(1-29)NH₂                             5             1.00                24.8 (Exp 1)
Ile⁻²-Pro⁻¹ {[Leu²⁷]-bGRF(1-29)NH₂}               18            0.045               43.2 (Exp 1)
Tyr⁻²-Ala⁻¹ {[Leu²⁷]-bGRF(1-29)NH₂}               25            0.13                38.5 (Exp 1)
Tyr⁻⁴-Ala⁻³-Tyr⁻²-Ala⁻¹ {[Leu²⁷]-bGRF(1-29)NH₂}   19            0.052               297.1# (Exp 2)

* Peptides were tested in an in vitro bovine anterior pituitary cell culture as described by
Friedman et al. (Int. J. Peptide and Protein Res. 37:14-20 [1991]).

** Peptides were incubated at 30 µM in bovine plasma in vitro at 37°C as described in
Kubiak et al. (Drug Met. Disp. 17:393-397 [1989]). Values presented here relate to
the half-life of [Leu²⁷]-bGRF(1-29)NH₂ (Seq. ID 5) incubated directly in plasma or
[Leu²⁷]-bGRF(1-29)NH₂ (Seq. ID 5) generated from extended peptides Seq. ID No.
18, 25 or 19, respectively. Two experiments with two different plasma pools were run,
Exp 1 and 2 as indicated in parentheses.

# Peptide Seq. ID 19 was tested against [Leu²⁷]-bGRF(1-29)NH₂ (Seq. ID 5) using a
different bovine plasma pool. The half-life of Seq. ID 5 in this plasma specimen was
50.2 min.



Table 2. Serum GH Response to IV Injections of Various Doses of [Leu²⁷]-bGRF(1-29)NH₂
(Seq ID 5) and Ile⁻²-Pro⁻¹ {[Leu²⁷]-bGRF(1-29)NH₂} (Seq ID 18) in
Meal-Fed Holstein Steers.*



Treatment 


Dose
nmol/kg


Number 
of 

Animals 
Respond- 
ing 


Peak Height
(ng/ml)


Time to 

Peak 

(min) 


Area 0-8 h 
(Unit) 


A 1 


B* 


A* 




A* 


B» 


Saline 


0 


or 


32.4 D 


314° 


89° 


89 b 


4.3° 


43 b 


Seq ID 18 


0.02 


8/10* 


71.2°' 


76.9 W 


23 c 


18* 


4.6 0X 


5.0 W 


Seq ID 5 


0.20


9/l0 c 


119^ 


130.9* 


23 c 


14 e 


6.8° 


7.0° 


Seq ID 18 


0.20


10/ Iff 


10L4 C *° 


101.4 C 


26 c 


26 e 


63 e0 


' 63 c<! ~ 


Seq ID 18 


20.0 


10/1CF 


137.8° 


137.8* 


23 c 


23 c 


10.1 c 


io.r 


SEM 




.04 


9.1 


9.4 


8 


8 


3 


'3 


0 p Value 




.0001 


.007 


.007 


.04 


.03 


.0001 


.0001 



* Animals were injected IV with peptides at the doses indicated 2 hrs before feeding,
and procedures were as described by Moseley et al., J. Endocrinology 117:253-259
(1988).

a Analysis A includes all steers and Analysis B includes only steers responding to GRF
injection and control steers.

b,c,d,e Values with different superscripts in a column are significantly different (P<.05).

Table 3. Serum GH Response to IV Injections of Various Doses of [Leu²⁷]-bGRF(1-29)NH₂
(Seq ID 5) and Tyr⁻⁴-Ala⁻³-Tyr⁻²-Ala⁻¹ {[Leu²⁷]-bGRF(1-29)NH₂}
(Seq ID 19) in Meal-Fed Holstein Steers.*



Treat- 
ment 

8 Saline 


Dose 
nmol/kg 


Number 
of 

Animals 
Respond- 
ing 


Peak Height 
(ng/ml) 


Time to 
Peak (min) 


Area 0-10 h 
(Unit) 


A' 


B* 


A 


B 


A 


B 




0 




30.9" 


30.9 B 


45 b 


45 b 


3.9° 


3.9° 


Seq IDS 


0.02 


9/12 w 


91.9 s 


10W 


41 b 


20* 


4.7" 


4.7°* 


Seq ID 19 


0.02 


10/12'' a 


88.5* 


92.4' 


30 1 


22' 


4.8* 


4.4°* 


1 Seq ID 5 


0.20 


11/12" 


793* 


79.1 M 


21 b 


18 C 


5.4' 


S3' 


I Seq ID 19 


O20 


7/12* 


97.4 C 


120.1 e 


63 b 


8 C 


5.4 C 


5.4 M 


| Seq ID 19 


2.0 


9/12 w 


74.5 s 


70.8"* 


24 b 


18 c 


6.9" 


6.8° 


SEM 




.10 


15.9 


17.3 


14 


8 


.4 


.4 


1 p Value 




.0001 


.06 


.06 


28 


.11 


.0008 


.007 


EMS 




.1117 


3042 


3623 


2260 


763 


2281 


2483 



* Steers were injected IV with peptides at the doses indicated 2 hrs before feeding, and
procedures were as described by Moseley et al., J. Endocrinology 117:253-259 (1988).

a Analysis A includes all steers and Analysis B includes only steers responding to GRF
injection and control steers.

b,c,d Values with different superscripts in a column are significantly different (P<.05).

SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT: Kubiak, Teresa M.
Sharma, Satish K.
(ii) TITLE OF INVENTION: Fusion Polypeptides
(iii) NUMBER OF SEQUENCES: 42
(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: Upjohn Company - Corp. Patents & Trademarks
(B) STREET: 301 Henrietta Street
(C) CITY: Kalamazoo
(D) STATE: Michigan
(E) COUNTRY: USA
(F) ZIP: 49001
(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: diskette (3M 3.5, DS double side 1.0 MB)
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: WordPerfect 5.1
(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER:
(B) FILING DATE:
(C) CLASSIFICATION:
(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: US07/626,727
(B) FILING DATE: 13/12/90
(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: US07/614,170
(B) FILING DATE: 14/11/90
(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: PCT/US90/02923
(B) FILING DATE: 30/05/90
(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: US07/368,231
(B) FILING DATE: 16/06/89
(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: US07/506,605
(B) FILING DATE: 09/04/90
(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: DeLuca, Mark
(B) REGISTRATION NUMBER: 33229
(C) REFERENCE/DOCKET NUMBER: 4595
(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: 616 385 5210
(B) TELEFAX: 616 385 6897

(2) INFORMATION FOR SEQ ID NO: 1:
(i) SEQUENCE CHARACTERISTIC:
(A) LENGTH: 33
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

Tyr Val Asp Ala Ile Phe Thr Ser Ser Tyr Arg Lys Val Leu Ala Gln
1 5 10 15

Leu Ser Ala Arg Lys Leu Leu Gln Asp Ile Leu Ser Arg Gln Gln Gly
20 25 30

Glu

(3) INFORMATION FOR SEQ ID NO: 2:
(i) SEQUENCE CHARACTERISTIC:
(A) LENGTH: 40
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Tyr Ile Asp Ala Ile Phe Thr Ser Ser Tyr Arg Lys Val Leu Ala Gln
1 5 10 15

Leu Ser Ala Arg Lys Leu Leu Gln Asp Ile Leu Ser Arg Gln Gln Gly
20 25 30

Glu Arg Asn Gln Glu Gln Gly Ala
35 40



(4) INFORMATION FOR SEQ ID NO: 3:
(i) SEQUENCE CHARACTERISTIC:
(A) LENGTH: 29
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: C-terminally amidated Argininyl residue
(B) LOCATION: Xaa29
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

Tyr Ala Asp Ala Ile Phe Thr Asn Ser Tyr Arg Lys Val Leu Ala Gln
1 5 10 15

Leu Ser Ala Arg Lys Leu Leu Gln Asp Ile Leu Asn Xaa
20 25



(5) INFORMATION FOR SEQ ID NO: 4:
(i) SEQUENCE CHARACTERISTIC:
(A) LENGTH: 29
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: C-terminally amidated Argininyl residue
(B) LOCATION: Xaa29
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Tyr Ile Asp Ala Ile Phe Thr Asn Ser Tyr Arg Lys Val Leu Ala Gln
1 5 10 15

Leu Ser Ala Arg Lys Leu Leu Gln Asp Ile Leu Asn Xaa
20 25

(6) INFORMATION FOR SEQ ID NO: 5:
(i) SEQUENCE CHARACTERISTIC:
(A) LENGTH: 29
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: C-terminally amidated Argininyl residue
(B) LOCATION: Xaa29
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

Tyr Ala Asp Ala Ile Phe Thr Asn Ser Tyr Arg Lys Val Leu Gly Gln
1 5 10 15

Leu Ser Ala Arg Lys Leu Leu Gln Asp Ile Leu Asn Xaa
20 25



(7) INFORMATION FOR SEQ ID NO: 6:
(i) SEQUENCE CHARACTERISTIC:
(A) LENGTH: 6
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

Tyr Ala Gly Pro Ile Pro
1 5

(8) INFORMATION FOR SEQ ID NO: 7:
(i) SEQUENCE CHARACTERISTIC:
(A) LENGTH: 8
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

Lys Pro Tyr Ala Gly Pro Ile Pro
1 5



(9) INFORMATION FOR SEQ ID NO: 8:
(i) SEQUENCE CHARACTERISTIC:
(A) LENGTH: 6
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

Tyr Ala Gly Pro Tyr Ala
1 5

(10) INFORMATION FOR SEQ ID NO: 9:
(i) SEQUENCE CHARACTERISTIC:
(A) LENGTH: 8
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

Lys Pro Tyr Ala Gly Pro Tyr Ala
1 5

(11) INFORMATION FOR SEQ ID NO: 10:
(i) SEQUENCE CHARACTERISTIC:
(A) LENGTH: 10
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

Phe Ala Lys Pro Tyr Ala Gly Pro Tyr Ala
1 5 10

(12) INFORMATION FOR SEQ ID NO: 11:
(i) SEQUENCE CHARACTERISTIC:
(A) LENGTH: 12
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

Gly Pro Phe Ala Lys Pro Tyr Ala Gly Pro Tyr Ala
1 5 10



(13) INFORMATION FOR SEQ ID NO: 12:
(i) SEQUENCE CHARACTERISTIC:
(A) LENGTH: 14
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

Val Pro Gly Pro Phe Ala Lys Pro Tyr Ala Gly Pro Tyr Ala
1 5 10

(14) INFORMATION FOR SEQ ID NO: 13:
(i) SEQUENCE CHARACTERISTIC:
(A) LENGTH: 16
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

Arg Pro Val Pro Gly Pro Phe Ala Lys Pro Tyr Ala Gly Pro Tyr Ala
1 5 10 15



(15) INFORMATION FOR SEQ ID NO: 14:
(i) SEQUENCE CHARACTERISTIC:
(A) LENGTH: 29
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: C-terminally amidated Argininyl residue
(B) LOCATION: Xaa29
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

Tyr Thr Asp Ala Ile Phe Thr Asn Ser Tyr Arg Lys Val Leu Ala Gln
1 5 10 15

Leu Ser Ala Arg Lys Leu Leu Gln Asp Ile Leu Asn Xaa
20 25



(16) INFORMATION FOR SEQ ID NO: 15:
(i) SEQUENCE CHARACTERISTIC:
(A) LENGTH: 11
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

Met Pro Ala His Pro His Pro His Pro His Ala
1 5 10



(17) INFORMATION FOR SEQ ID NO: 16:
(i) SEQUENCE CHARACTERISTIC:
(A) LENGTH: 11
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

Met Ala Pro His Ala His Ala His Ala His Ala
1 5 10

(18) INFORMATION FOR SEQ ID NO: 17:
(i) SEQUENCE CHARACTERISTIC:
(A) LENGTH: 11
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

Met Gly Pro His Pro His Pro His Pro His Ala
1 5 10



(19) INFORMATION FOR SEQ ID NO: 18:
(i) SEQUENCE CHARACTERISTIC:
(A) LENGTH: 31
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: C-terminally amidated Argininyl residue
(B) LOCATION: Xaa31
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

Ile Pro Tyr Ala Asp Ala Ile Phe Thr Asn Ser Tyr Arg Lys Val Leu
1 5 10 15

Gly Gln Leu Ser Ala Arg Lys Leu Leu Gln Asp Ile Leu Asn Xaa
20 25 30



(20) INFORMATION FOR SEQ ID NO: 19:
(i) SEQUENCE CHARACTERISTIC:
(A) LENGTH: 33
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: C-terminally amidated Argininyl residue
(B) LOCATION: Xaa33
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

Tyr Ala Tyr Ala Tyr Ala Asp Ala Ile Phe Thr Ser Ser Tyr Arg Lys
1 5 10 15

Val Leu Ala Gln Leu Ser Ala Arg Lys Leu Leu Gln Asp Ile Leu Ser
20 25 30

Xaa

(21) INFORMATION FOR SEQ ID NO: 20:
(i) SEQUENCE CHARACTERISTIC:
(A) LENGTH: 39
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: C-terminally amidated Argininyl residue
(B) LOCATION: Xaa39
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

Phe Ala Lys Pro Tyr Ala Gly Pro Tyr Ala Tyr Ala Asp Ala Ile Phe
1 5 10 15

Thr Asn Ser Tyr Arg Lys Val Leu Ala Gln Leu Ser Ala Arg Lys Leu
20 25 30

Leu Gln Asp Ile Leu Asn Xaa
35

(22) INFORMATION FOR SEQ ID NO: 21:
(i) SEQUENCE CHARACTERISTIC:
(A) LENGTH: 45
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: C-terminally amidated Argininyl residue
(B) LOCATION: Xaa45
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:

Arg Pro Val Pro Gly Pro Phe Ala Lys Pro Tyr Ala Gly Pro Tyr Ala
1 5 10 15

Tyr Ala Asp Ala Ile Phe Thr Asn Ser Tyr Arg Lys Val Leu Gly Gln
20 25 30

Leu Ser Ala Arg Lys Leu Leu Gln Asp Ile Leu Asn Xaa
35 40 45

(23) INFORMATION FOR SEQ ID NO: 22:
(i) SEQUENCE CHARACTERISTIC:
(A) LENGTH: 10
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:

Phe Ala Lys Pro Tyr Ala Gly Pro Tyr Ala
1 5 10

(24) INFORMATION FOR SEQ ID NO: 23:
(i) SEQUENCE CHARACTERISTIC:
(A) LENGTH: 16
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:

Arg Pro Val Pro Gly Pro Phe Ala Lys Pro Tyr Ala Gly Pro Tyr Ala
1 5 10 15



(25) INFORMATION FOR SEQ ID NO: 24:
(i) SEQUENCE CHARACTERISTIC:
(A) LENGTH: 27
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: C-terminally amidated Argininyl residue
(B) LOCATION: Xaa27
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:

Asp Ala Ile Phe Thr Asn Ser Tyr Arg Lys Val Leu Gly Gln Leu Ser
1 5 10 15

Ala Arg Lys Leu Leu Gln Asp Ile Leu Asn Xaa
20 25



(26) INFORMATION FOR SEQ ID NO: 25:
(i) SEQUENCE CHARACTERISTIC:
(A) LENGTH: 31
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: C-terminally amidated Argininyl residue
(B) LOCATION: Xaa31
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:

Tyr Ala Tyr Ala Asp Ala Ile Phe Thr Ser Ser Tyr Arg Lys Val Leu
1 5 10 15

Gly Gln Leu Ser Ala Arg Lys Leu Leu Gln Asp Ile Leu Asn Xaa
20 25 30

(27) INFORMATION FOR SEQ ID NO: 26:
(i) SEQUENCE CHARACTERISTIC:
(A) LENGTH: 33
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: C-terminally amidated Argininyl residue
(B) LOCATION: Xaa33
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:

Gly Pro Ile Pro Tyr Ala Asp Ala Ile Phe Thr Asn Ser Tyr Arg Lys
1 5 10 15

Val Leu Gly Gln Leu Ser Ala Arg Lys Leu Leu Gln Asp Ile Leu Asn
20 25 30

Xaa
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(28) INFORMATION FOR SEQ ID NO: 27: 

<i) SEQUENCE CHARACTERISTIC: 

(A) LENGTH: 35 

(B) TYPE: amino acid 
(D) TOPOLOGY : linear 

(ix) FEATURE: 

(A) NAME /KEY: C-terminally amidated Argininyl residue 

(B) LOCATION: Xaa35 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 



Tyr Ala Gly Pro He Pro Tyr Ala Asp Ala He Phe Thr Asn Ser Tyr 
15 1 5 10 15 

Arg Lys Val Leu Gly Gin Leu Ser Ala Arg Lys Leu Leu Gin Asp He 
20 25 30 



20 Leu Asn Xaa 
35 



(29) INFORMATION FOR SEQ ID NO: 28:
(i) SEQUENCE CHARACTERISTIC:

(A) LENGTH: 37

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: C-terminally amidated Argininyl residue

(B) LOCATION: Xaa37

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:

Lys Pro Tyr Ala Gly Pro Ile Pro Tyr Ala Asp Ala Ile Phe Thr Asn
1 5 10 15

Ser Tyr Arg Lys Val Leu Gly Gln Leu Ser Ala Arg Lys Leu Leu Gln
20 25 30

Asp Ile Leu Asn Xaa
35



(30) INFORMATION FOR SEQ ID NO: 29:
(i) SEQUENCE CHARACTERISTIC:

(A) LENGTH: 33

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: C-terminally amidated Argininyl residue

(B) LOCATION: Xaa33

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:

Gly Pro Tyr Ala Tyr Ala Asp Ala Ile Phe Thr Asn Ser Tyr Arg Lys
1 5 10 15

Val Leu Gly Gln Leu Ser Ala Arg Lys Leu Leu Gln Asp Ile Leu Asn
20 25 30

Xaa



(31) INFORMATION FOR SEQ ID NO: 30:
(i) SEQUENCE CHARACTERISTIC:

(A) LENGTH: 35

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: C-terminally amidated Argininyl residue

(B) LOCATION: Xaa35

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:

Tyr Ala Gly Pro Tyr Ala Tyr Ala Asp Ala Ile Phe Thr Asn Ser Tyr
1 5 10 15

Arg Lys Val Leu Gly Gln Leu Ser Ala Arg Lys Leu Leu Gln Asp Ile
20 25 30

Leu Asn Xaa
35

(32) INFORMATION FOR SEQ ID NO: 31:
(i) SEQUENCE CHARACTERISTIC:

(A) LENGTH: 37

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: C-terminally amidated Argininyl residue

(B) LOCATION: Xaa37

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:

Lys Pro Tyr Ala Gly Pro Tyr Ala Tyr Ala Asp Ala Ile Phe Thr Asn
1 5 10 15

Ser Tyr Arg Lys Val Leu Gly Gln Leu Ser Ala Arg Lys Leu Leu Gln
20 25 30

Asp Ile Leu Asn Xaa
35



(33) INFORMATION FOR SEQ ID NO: 32:

(i) SEQUENCE CHARACTERISTIC:

(A) LENGTH: 39

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: C-terminally amidated Argininyl residue

(B) LOCATION: Xaa39

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:

Phe Ala Lys Pro Tyr Ala Gly Pro Tyr Ala Tyr Ala Asp Ala Ile Phe
1 5 10 15

Thr Asn Ser Tyr Arg Lys Val Leu Gly Gln Leu Ser Ala Arg Lys Leu
20 25 30

Leu Gln Asp Ile Leu Asn Xaa
35



(34) INFORMATION FOR SEQ ID NO: 33:

(i) SEQUENCE CHARACTERISTIC:

(A) LENGTH: 41

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: C-terminally amidated Argininyl residue

(B) LOCATION: Xaa41

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:

Gly Pro Phe Ala Lys Pro Tyr Ala Gly Pro Tyr Ala Tyr Ala Asp Ala
1 5 10 15

Ile Phe Thr Asn Ser Tyr Arg Lys Val Leu Gly Gln Leu Ser Ala Arg
20 25 30

Lys Leu Leu Gln Asp Ile Leu Asn Xaa

(35) INFORMATION FOR SEQ ID NO: 34:
(i) SEQUENCE CHARACTERISTIC:

(A) LENGTH: 43

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: C-terminally amidated Argininyl residue

(B) LOCATION: Xaa43

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:

Val Pro Gly Pro Phe Ala Lys Pro Tyr Ala Gly Pro Tyr Ala Tyr Ala
1 5 10 15

Asp Ala Ile Phe Thr Asn Ser Tyr Arg Lys Val Leu Gly Gln Leu Ser
20 25 30

Ala Arg Lys Leu Leu Gln Asp Ile Leu Asn Xaa
35 40



(36) INFORMATION FOR SEQ ID NO: 35:
(i) SEQUENCE CHARACTERISTIC:

(A) LENGTH: 45

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: C-terminally amidated Argininyl residue

(B) LOCATION: Xaa45

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:

Arg Pro Val Pro Gly Pro Phe Ala Lys Pro Tyr Ala Gly Pro Tyr Ala
1 5 10 15

Tyr Ala Asp Ala Ile Phe Thr Asn Ser Tyr Arg Lys Val Leu Gly Gln
20 25 30

Leu Ser Ala Arg Lys Leu Leu Gln Asp Ile Leu Asn Xaa
35 40 45



(37) INFORMATION FOR SEQ ID NO: 36:
(i) SEQUENCE CHARACTERISTIC:

(A) LENGTH: 31

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: C-terminally amidated Argininyl residue

(B) LOCATION: Xaa31

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36:

Val Ala Tyr Ala Asp Ala Ile Phe Thr Asn Ser Tyr Arg Lys Val Leu
1 5 10 15

Gly Gln Leu Ser Ala Arg Lys Leu Leu Gln Asp Ile Leu Asn Xaa
20 25 30



(38) INFORMATION FOR SEQ ID NO: 37:

(i) SEQUENCE CHARACTERISTIC:

(A) LENGTH: 31

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: C-terminally amidated Argininyl residue

(B) LOCATION: Xaa31

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:

Tyr Thr Tyr Ala Asp Ala Ile Phe Thr Asn Ser Tyr Arg Lys Val Leu
1 5 10 15

Ala Gln Leu Ser Ala Arg Lys Leu Leu Gln Asp Ile Leu Asn Xaa
20 25 30

(39) INFORMATION FOR SEQ ID NO: 38:

(i) SEQUENCE CHARACTERISTIC:

(A) LENGTH: 31

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: C-terminally amidated Argininyl residue

(B) LOCATION: Xaa31

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:

Tyr Thr Tyr Ile Asp Ala Ile Phe Thr Asn Ser Tyr Arg Lys Val Leu
1 5 10 15

Ala Gln Leu Ser Ala Arg Lys Leu Leu Gln Asp Ile Leu Asn Xaa
20 25 30

(40) INFORMATION FOR SEQ ID NO: 39:

(i) SEQUENCE CHARACTERISTIC:

(A) LENGTH: 31

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: C-terminally amidated Argininyl residue

(B) LOCATION: Xaa31

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39:

Tyr Thr Tyr Thr Asp Ala Ile Phe Thr Asn Ser Tyr Arg Lys Val Leu
1 5 10 15

Ala Gln Leu Ser Ala Arg Lys Leu Leu Gln Asp Ile Leu Asn Xaa
20 25 30



(41) INFORMATION FOR SEQ ID NO: 40:
(i) SEQUENCE CHARACTERISTIC:

(A) LENGTH: 31

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: C-terminally amidated Argininyl residue

(B) LOCATION: Xaa31

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:

Tyr Ser Tyr Thr Asp Ala Ile Phe Thr Asn Ser Tyr Arg Lys Val Leu
1 5 10 15

Ala Gln Leu Ser Ala Arg Lys Leu Leu Gln Asp Ile Leu Asn Xaa
20 25 30

(42) INFORMATION FOR SEQ ID NO: 41:
(i) SEQUENCE CHARACTERISTIC:

(A) LENGTH: 33

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: C-terminally amidated Argininyl residue

(B) LOCATION: Xaa33

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41:

Tyr Thr Tyr Thr Tyr Thr Asp Ala Ile Phe Thr Asn Ser Tyr Arg Lys
1 5 10 15

Val Leu Ala Gln Leu Ser Ala Arg Lys Leu Leu Gln Asp Ile Leu Asn
20 25 30

Xaa



(43) INFORMATION FOR SEQ ID NO: 42:
(i) SEQUENCE CHARACTERISTIC:

(A) LENGTH: 31

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: C-terminally amidated Argininyl residue

(B) LOCATION: Xaa31

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42:

Tyr Ala Tyr Ala Asp Ala Ile Phe Thr Ser Ser Tyr Arg Lys Val Leu
1 5 10 15

Gly Gln Leu Ser Ala Arg Lys Leu Leu Gln Asp Ile Leu Asn Xaa
20 25 30

(44) INFORMATION FOR SEQ ID NO: 43:
(i) SEQUENCE CHARACTERISTIC:

(A) LENGTH: 33

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: C-terminally amidated Argininyl residue

(B) LOCATION: Xaa33

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43:

Tyr Ala Tyr Ala Tyr Ala Asp Ala Ile Phe Thr Ser Ser Tyr Arg Lys
1 5 10 15

Val Leu Ala Gln Leu Ser Ala Arg Lys Leu Leu Gln Asp Ile Leu Ser
20 25 30

Xaa

CLAIMS

1. A non-naturally-occurring fusion protein comprising an extension peptide portion
covalently linked at its C-terminus to the N-terminus of a core protein portion, said extension
peptide portion being of the formula:

A-X-Y(X'-Y)n

wherein

A is optional and when present is methionine;

n is 0-20;

X is selected from the group consisting of all naturally occurring amino acid residues;

X' is selected from the group consisting of all naturally occurring amino acid residues
except proline and hydroxyproline;

Y is selected from the group consisting of proline, hydroxyproline, alanine, serine and
threonine except when n is zero and A is absent then Y is selected from the group consisting of
alanine, serine and threonine.
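
By way of illustration only, the extension peptide formula of claim 1 can be written as a short membership check. The Python sketch below forms no part of the claimed subject matter. The function name `matches_extension_formula`, the use of three-letter residue codes (with `Hyp` standing for hydroxyproline), and the reading of "naturally occurring amino acid residues" as the twenty standard residues plus hydroxyproline are assumptions made for the example.

```python
# Illustrative sketch only: tests a peptide against A-X-Y(X'-Y)n of claim 1.
# Assumption: "naturally occurring" means the 20 standard residues plus Hyp.
STANDARD = {
    "Ala", "Arg", "Asn", "Asp", "Cys", "Gln", "Glu", "Gly", "His", "Ile",
    "Leu", "Lys", "Met", "Phe", "Pro", "Ser", "Thr", "Trp", "Tyr", "Val",
}
NATURAL = STANDARD | {"Hyp"}
Y_FULL = {"Pro", "Hyp", "Ala", "Ser", "Thr"}
Y_RESTRICTED = {"Ala", "Ser", "Thr"}  # Y when n is zero and A is absent
X_PRIME = NATURAL - {"Pro", "Hyp"}


def _fits(residues, has_a):
    """Check X-Y(X'-Y)n for the residues that follow the optional A."""
    if len(residues) < 2 or len(residues) % 2 != 0:
        return False
    n = len(residues) // 2 - 1
    if n > 20:
        return False
    if residues[0] not in NATURAL:
        return False
    allowed_y = Y_RESTRICTED if (n == 0 and not has_a) else Y_FULL
    if residues[1] not in allowed_y:
        return False
    # Each repeat unit is one X' followed by one Y.
    for i in range(2, len(residues), 2):
        if residues[i] not in X_PRIME or residues[i + 1] not in Y_FULL:
            return False
    return True


def matches_extension_formula(peptide):
    """Return True if the peptide fits the formula with A absent or present."""
    residues = peptide.replace("-", " ").split()
    if _fits(residues, has_a=False):
        return True
    # Otherwise try reading a leading methionine as A.
    return bool(residues) and residues[0] == "Met" and _fits(residues[1:], True)
```

Under these assumptions, `matches_extension_formula("Gly-Pro-Ile-Pro")` and `matches_extension_formula("Tyr-Ala")` both hold, whereas `"Gly-Pro"` alone does not, because Y may not be proline when n is zero and A is absent.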

2. A non-naturally-occurring fusion protein according to claim 1 wherein A is present and 
X is selected from the group consisting of Pro, Gly, Ala and Ser. 

3. A non-naturally-occurring fusion protein according to claim 1 wherein n is 0-10.

4. A non-naturally-occurring fusion protein according to claim 1 wherein said biologically
active polypeptide is selected from the group consisting of: bGRF analogs; EGF; IGF-2;
glucagon; corticotropin releasing factor; dynorphin; somatostatin-14; endothelin; transforming
growth factor α; Vasoactive Intestinal Peptide; human β-casomorphin; Gastric Inhibitory
Peptide; Gastric Releasing Peptide; human Peptide HI; human Peptide YY; glucagon-like
peptide-1 fragment 7-37; glucagon-like peptide-2; substance P; Neuropeptide Y; human
Pancreatic Polypeptide; insulin-like growth factor-1; human growth hormone; bovine growth
hormone; porcine growth hormone; prolactin; human growth hormone releasing factor; bovine
growth hormone releasing factor; porcine growth hormone releasing factor; ovine growth
hormone releasing factor; interleukin-10; and interleukin-2.

5. A non-naturally-occurring fusion protein according to claim 1 wherein said extension
peptide portion is selected from the group consisting of Gly-Pro-Ile-Pro, Seq ID 6, Seq ID 7,
Tyr-Ala, Gly-Pro-Tyr-Ala, Seq ID 8, Seq ID 9, Seq ID 10, Seq ID 11, Seq ID 12, Seq ID 13,
Tyr-Ala-Tyr-Ala, Val-Ala, Seq ID 15, Seq ID 16, Seq ID 17, Seq ID 22 and Seq ID 23.

6. A non-naturally-occurring fusion protein according to claim 2 wherein n is 3-5.



7. A non-naturally-occurring fusion protein according to claim 6 wherein all X' residues 
are histidines. 

8. A non-naturally-occurring fusion protein according to claim 7 wherein X is a histidine.

9. A non-naturally-occurring fusion protein according to claim 7 wherein three Y residues 
are proline. 

10. Use of a non-naturally-occurring fusion protein according to claim 1 to prepare a 
medicament. 

11. A use according to claim 10, wherein said medicament comprises additional fusion
proteins having identical biologically active portions and different extension portions.

12. A use according to claim 10 wherein said biologically active portion of said
non-naturally-occurring fusion protein is a bGRF analog.

13. A method of purifying desired proteins from a mixture containing a
non-naturally-occurring fusion protein according to claim 1 and impurities comprising the steps of:

selectively contacting said fusion protein with material which immobilizes said fusion
protein;

removing said impurities;

separating said fusion protein from said material;

combining said fusion protein with DPP IV; and

isolating said desired protein.

14. A method according to claim 13 wherein said material is fixed in a column. 

15. A method according to claim 13 wherein said material is an antibody which binds to 
said extension portion. 

16. A method according to claim 13 wherein said material is immobilized metal ions and
said extension portion comprises at least 3 consecutive X' residues that are histidines.
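
Purely as an illustration, the histidine condition of claim 16 can be checked programmatically. The Python sketch below is not part of the claimed subject matter. It assumes that "consecutive X' residues" means the X' positions of adjacent (X'-Y) repeat units in the formula of claim 1; the function names, the `has_a` flag and the example peptide are hypothetical.

```python
# Illustrative sketch only: counts histidines at consecutive X' positions of
# an extension peptide written as A-X-Y(X'-Y)n with three-letter codes.
def longest_his_run_in_x_prime(peptide, has_a=False):
    """Return the longest run of His among the X' positions."""
    residues = peptide.replace("-", " ").split()
    if has_a:
        residues = residues[1:]  # drop the leading methionine (A)
    # After the initial X-Y pair, X' residues sit at indices 2, 4, 6, ...
    x_primes = residues[2::2]
    best = run = 0
    for residue in x_primes:
        run = run + 1 if residue == "His" else 0
        best = max(best, run)
    return best


def has_three_consecutive_his(peptide, has_a=False):
    """True if at least 3 consecutive X' residues are histidines."""
    return longest_his_run_in_x_prime(peptide, has_a) >= 3
```

For the hypothetical extension Met-His-Pro-His-Pro-His-Pro-His-Pro (A present, n = 3), `has_three_consecutive_his(..., has_a=True)` holds, since all three X' residues are histidines; for Tyr-Ala-Tyr-Ala it does not.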